linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH V6] powercap/drivers/idle_injection: Add an idle injection framework
@ 2018-06-12 12:00 Daniel Lezcano
  2018-06-12 12:30 ` Peter Zijlstra
                   ` (4 more replies)
  0 siblings, 5 replies; 27+ messages in thread
From: Daniel Lezcano @ 2018-06-12 12:00 UTC (permalink / raw)
  To: viresh.kumar, rjw
  Cc: linux-kernel, linux-pm, Eduardo Valentin, Javi Merino, Leo Yan,
	Kevin Wangtao, Vincent Guittot, Rui Zhang, Daniel Thompson,
	Peter Zijlstra, Andrea Parri

Initially, the cpu_cooling device for ARM was changed by adding a new
policy inserting idle cycles. The intel_powerclamp driver does a
similar action.

Instead of implementing idle injections privately in the cpu_cooling
device, move the idle injection code in a dedicated framework and give
the opportunity to other frameworks to make use of it.

The framework relies on the smpboot kthreads which handles via its
main loop the common code for hotplugging and [un]parking.

This code was previously tested with the cpu cooling device and went
through several iterations. It results now in split code and API
exported in the header file. It was tested with the cpu cooling device
with success.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Eduardo Valentin <edubezval@gmail.com>
Cc: Javi Merino <javi.merino@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Kevin Wangtao <kevin.wangtao@linaro.org>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Rui Zhang <rui.zhang@intel.com>
Cc: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andrea Parri <andrea.parri@amarulasolutions.com>
---
V6:
   - Move count to wake up function (Viresh Kumar)
   - Split atomic/non-atomic context to wake up tasks
   - Add the park callback to handle unplug/inject race
   - Replace atomic by READ_ONCE and WRITE_ONCE (Peter Zijlstra)

V5:
   - Move init_completion in the init function (Viresh Kumar)
   - Increment task count in the wakeup function (Viresh Kumar)
   - Remove rollback at init time (Viresh Kumar)

V4:
   - Wait for completion when stopping (Viresh Kumar)
   - Check init already done and rollback (Viresh Kumar)

V3:
   - Fixed typos (Viresh Kumar)
   - Removed extra blank line (Viresh Kumar)
   - Added full stop (Viresh Kumar)
   - Fixed Return kerneldoc format (Viresh Kumar)
   - Fixed multiple kthreads initialization (Viresh Kumar)
   - Fixed rollbacking the actions in the unregister function (Viresh Kumar)
   - Make idle injection kthreads name hardcoded
   - Kthreads are created in the initcall process

 V2: Fixed checkpatch warnings

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
---
 drivers/powercap/Kconfig          |  10 +
 drivers/powercap/Makefile         |   1 +
 drivers/powercap/idle_injection.c | 414 ++++++++++++++++++++++++++++++++++++++
 include/linux/idle_injection.h    |  29 +++
 4 files changed, 454 insertions(+)
 create mode 100644 drivers/powercap/idle_injection.c
 create mode 100644 include/linux/idle_injection.h

diff --git a/drivers/powercap/Kconfig b/drivers/powercap/Kconfig
index 85727ef..a767ef2 100644
--- a/drivers/powercap/Kconfig
+++ b/drivers/powercap/Kconfig
@@ -29,4 +29,14 @@ config INTEL_RAPL
 	  controller, CPU core (Power Plance 0), graphics uncore (Power Plane
 	  1), etc.
 
+config IDLE_INJECTION
+	bool "Idle injection framework"
+	depends on CPU_IDLE
+	default n
+	help
+	  This enables support for the idle injection framework. It
+	  provides a way to force idle periods on a set of specified
+	  CPUs for power capping. Idle period can be injected
+	  synchronously on a set of specified CPUs or alternatively
+	  on a per CPU basis.
 endif
diff --git a/drivers/powercap/Makefile b/drivers/powercap/Makefile
index 0a21ef3..c3bbfee 100644
--- a/drivers/powercap/Makefile
+++ b/drivers/powercap/Makefile
@@ -1,2 +1,3 @@
 obj-$(CONFIG_POWERCAP)	+= powercap_sys.o
 obj-$(CONFIG_INTEL_RAPL) += intel_rapl.o
+obj-$(CONFIG_IDLE_INJECTION) += idle_injection.o
diff --git a/drivers/powercap/idle_injection.c b/drivers/powercap/idle_injection.c
new file mode 100644
index 0000000..1ff1a46
--- /dev/null
+++ b/drivers/powercap/idle_injection.c
@@ -0,0 +1,414 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright 2018 Linaro Limited
+ *
+ * Author: Daniel Lezcano <daniel.lezcano@linaro.org>
+ *
+ * The idle injection framework proposes a way to force a cpu to enter
+ * an idle state during a specified amount of time for a specified
+ * period.
+ *
+ * It relies on the smpboot kthreads which handles, via its main loop,
+ * the common code for hotplugging and [un]parking.
+ *
+ * At init time, all the kthreads are created and parked.
+ *
+ * A cpumask is specified as parameter for the idle injection
+ * registering function. The kthreads will be synchronized regarding
+ * this cpumask.
+ *
+ * The idle + run duration is specified via the helpers and then the
+ * idle injection can be started at this point.
+ *
+ * A kthread will call play_idle() with the specified idle duration
+ * from above.
+ *
+ * A timer is set after waking up all the tasks, to the next idle
+ * injection cycle.
+ *
+ * The task handling the timer interrupt will wakeup all the kthreads
+ * belonging to the cpumask.
+ */
+#define pr_fmt(fmt) "ii_dev: " fmt
+
+#include <linux/cpu.h>
+#include <linux/freezer.h>
+#include <linux/hrtimer.h>
+#include <linux/sched.h>
+#include <linux/slab.h>
+#include <linux/smpboot.h>
+
+#include <uapi/linux/sched/types.h>
+
+/**
+ * struct idle_injection_thread - task on/off switch structure
+ * @tsk: a pointer to a task_struct injecting the idle cycles
+ * @should_run: a integer used as a boolean by the smpboot kthread API
+ */
+struct idle_injection_thread {
+	struct task_struct *tsk;
+	int should_run;
+};
+
+/**
+ * struct idle_injection_device - data for the idle injection
+ * @cpumask: a cpumask containing the list of CPUs managed by the device
+ * @timer: a hrtimer giving the tempo for the idle injection
+ * @count: an atomic to keep track of the last task exiting the idle cycle
+ * @idle_duration_ms: an unsigned int specifying the idle duration
+ * @run_duration_ms: an unsigned int specifying the running duration
+ */
+struct idle_injection_device {
+	cpumask_var_t cpumask;
+	struct hrtimer timer;
+	struct completion stop_complete;
+	unsigned int idle_duration_ms;
+	unsigned int run_duration_ms;
+	atomic_t count;
+};
+
+static DEFINE_PER_CPU(struct idle_injection_thread, idle_injection_thread);
+static DEFINE_PER_CPU(struct idle_injection_device *, idle_injection_device);
+
+/**
+ * __idle_injection_wakeup - Wake up all idle injection threads
+ * @ii_dev: the idle injection device
+ *
+ * Every idle injection task belonging to the idle injection device
+ * and running on an online CPU will be wake up by this call.
+ */
+static void __idle_injection_wakeup(struct idle_injection_device *ii_dev)
+{
+	struct idle_injection_thread *iit;
+	struct cpumask tmp;
+	unsigned int cpu;
+
+	cpumask_and(&tmp, ii_dev->cpumask, cpu_online_mask);
+
+	atomic_set(&ii_dev->count, cpumask_weight(&tmp));
+
+	for_each_cpu(cpu, &tmp) {
+		iit = per_cpu_ptr(&idle_injection_thread, cpu);
+		iit->should_run = 1;
+		wake_up_process(iit->tsk);
+	}
+}
+
+/**
+ * idle_injection_wakeup - Wake up all idle injection threads
+ * @ii_dev: the idle injection device
+ *
+ * This function wakes up all the idle injection tasks belonging to
+ * @ii_dev by invoking __idle_injection_wakeup() but with the cpu
+ * hoplug disabled.
+ */
+static void idle_injection_wakeup(struct idle_injection_device *ii_dev)
+{
+	get_online_cpus();
+	__idle_injection_wakeup(ii_dev);
+	put_online_cpus();
+}
+
+/**
+ * idle_injection_wakeup_fn - idle injection timer callback
+ * @timer: a hrtimer structure
+ *
+ * This function is called when the idle injection timer expires which
+ * will wake up the idle injection tasks and these ones, in turn, play
+ * idle a specified amount of time.
+ *
+ * Return: HRTIMER_NORESTART.
+ */
+static enum hrtimer_restart idle_injection_wakeup_fn(struct hrtimer *timer)
+{
+	struct idle_injection_device *ii_dev =
+		container_of(timer, struct idle_injection_device, timer);
+
+	__idle_injection_wakeup(ii_dev);
+
+	return HRTIMER_NORESTART;
+}
+
+/**
+ * idle_injection_last_man - operations by the last task
+ * @ii_dev: a pointer to an idle_injection_device structure
+ *
+ * This function groups the operations done by the last idle injection
+ * task. It can be called from the idle injection callback as well as
+ * from the park callback when the thread is parked at hotplug time.
+ */
+static void idle_injection_last_man(struct idle_injection_device *ii_dev)
+{
+	unsigned int run_duration_ms;
+
+	run_duration_ms = READ_ONCE(ii_dev->run_duration_ms);
+	if (run_duration_ms) {
+		hrtimer_start(&ii_dev->timer, ms_to_ktime(run_duration_ms),
+			      HRTIMER_MODE_REL_PINNED);
+		return;
+	}
+
+	complete(&ii_dev->stop_complete);
+}
+
+/**
+ * idle_injection_fn - idle injection routine
+ * @cpu: the CPU number the tasks belongs to
+ *
+ * The idle injection routine will stay idle the specified amount of
+ * time
+ */
+static void idle_injection_fn(unsigned int cpu)
+{
+	struct idle_injection_device *ii_dev;
+	struct idle_injection_thread *iit;
+	unsigned int idle_duration_ms;
+
+	ii_dev = per_cpu(idle_injection_device, cpu);
+	iit = per_cpu_ptr(&idle_injection_thread, cpu);
+
+	/*
+	 * Boolean used by the smpboot main loop and used as a
+	 * flip-flop in this function
+	 */
+	iit->should_run = 0;
+
+	idle_duration_ms = READ_ONCE(ii_dev->idle_duration_ms);
+	if (idle_duration_ms)
+		play_idle(idle_duration_ms);
+
+	if (atomic_dec_and_test(&ii_dev->count))
+		idle_injection_last_man(ii_dev);
+}
+
+/**
+ * idle_injection_set_duration - idle and run duration helper
+ * @run_duration_ms: an unsigned int giving the running time in milliseconds
+ * @idle_duration_ms: an unsigned int giving the idle time in milliseconds
+ */
+void idle_injection_set_duration(struct idle_injection_device *ii_dev,
+				 unsigned int run_duration_ms,
+				 unsigned int idle_duration_ms)
+{
+	WRITE_ONCE(ii_dev->run_duration_ms, run_duration_ms);
+	WRITE_ONCE(ii_dev->idle_duration_ms, idle_duration_ms);
+}
+
+/**
+ * idle_injection_get_duration - idle and run duration helper
+ * @run_duration_ms: a pointer to an unsigned int to store the running time
+ * @idle_duration_ms: a pointer to an unsigned int to store the idle time
+ */
+void idle_injection_get_duration(struct idle_injection_device *ii_dev,
+				 unsigned int *run_duration_ms,
+				 unsigned int *idle_duration_ms)
+{
+	*run_duration_ms = READ_ONCE(ii_dev->run_duration_ms);
+	*idle_duration_ms = READ_ONCE(ii_dev->idle_duration_ms);
+}
+
+/**
+ * idle_injection_start - starts the idle injections
+ * @ii_dev: a pointer to an idle_injection_device structure
+ *
+ * The function starts the idle injection cycles by first waking up
+ * all the tasks the ii_dev is attached to and let them handle the
+ * idle-run periods.
+ *
+ * Return: -EINVAL if the idle or the running durations are not set.
+ */
+int idle_injection_start(struct idle_injection_device *ii_dev)
+{
+	if (!READ_ONCE(ii_dev->idle_duration_ms))
+		return -EINVAL;
+
+	if (!READ_ONCE(ii_dev->run_duration_ms))
+		return -EINVAL;
+
+	pr_debug("Starting injecting idle cycles on CPUs '%*pbl'\n",
+		 cpumask_pr_args(ii_dev->cpumask));
+
+	idle_injection_wakeup(ii_dev);
+
+	return 0;
+}
+
+/**
+ * idle_injection_stop - stops the idle injections
+ * @ii_dev: a pointer to an idle injection_device structure
+ *
+ * The function stops the idle injection by resetting the idle and
+ * running durations and wait for the threads to complete. If we are
+ * in the process of injecting an idle cycle, then this will wait the
+ * end of the cycle.
+ */
+void idle_injection_stop(struct idle_injection_device *ii_dev)
+{
+	pr_debug("Stopping injecting idle cycles on CPUs '%*pbl'\n",
+		 cpumask_pr_args(ii_dev->cpumask));
+
+	idle_injection_set_duration(ii_dev, 0, 0);
+
+	wait_for_completion_interruptible(&ii_dev->stop_complete);
+}
+
+/**
+ * idle_injection_setup - initialize the current task as a RT task
+ * @cpu: the CPU number where the kthread is running on (not used)
+ *
+ * Called one time, this function is in charge of setting the task
+ * scheduler parameters.
+ */
+static void idle_injection_setup(unsigned int cpu)
+{
+	struct sched_param param = { .sched_priority = MAX_USER_RT_PRIO / 2 };
+
+	set_freezable();
+
+	sched_setscheduler(current, SCHED_FIFO, &param);
+}
+
+/**
+ * idle_injection_park - move out of the idle injection threads pool
+ * @cpu: the CPU number where the kthread is running on
+ *
+ * Catch the situation where the CPU is hotplugged while we were about
+ * to inject an idle cycle.
+ */
+static void idle_injection_park(unsigned int cpu)
+{
+	struct idle_injection_device *ii_dev;
+	struct idle_injection_thread *iit;
+
+	ii_dev = per_cpu(idle_injection_device, cpu);
+	iit = per_cpu_ptr(&idle_injection_thread, cpu);
+
+	/*
+	 * If we are supposed to run an idle cycle but obviously we
+	 * can't because we are in the process of parking, so bail out
+	 * from the pool of idle injection tasks, do nothing otherwise.
+	 */
+	if (!iit->should_run)
+		return;
+
+	iit->should_run = 0;
+
+	if (atomic_dec_and_test(&ii_dev->count))
+		idle_injection_last_man(ii_dev);
+}
+
+/**
+ * idle_injection_should_run - function helper for the smpboot API
+ * @cpu: the CPU number where the kthread is running on
+ *
+ * Return: a boolean telling if the thread can run.
+ */
+static int idle_injection_should_run(unsigned int cpu)
+{
+	struct idle_injection_thread *iit =
+		per_cpu_ptr(&idle_injection_thread, cpu);
+
+	return iit->should_run;
+}
+
+static struct idle_injection_device *ii_dev_alloc(void)
+{
+	struct idle_injection_device *ii_dev;
+
+	ii_dev = kzalloc(sizeof(*ii_dev), GFP_KERNEL);
+	if (!ii_dev)
+		return NULL;
+
+	if (!alloc_cpumask_var(&ii_dev->cpumask, GFP_KERNEL)) {
+		kfree(ii_dev);
+		return NULL;
+	}
+
+	return ii_dev;
+}
+
+static void ii_dev_free(struct idle_injection_device *ii_dev)
+{
+	free_cpumask_var(ii_dev->cpumask);
+	kfree(ii_dev);
+}
+
+/**
+ * idle_injection_register - idle injection init routine
+ * @cpumask: the list of CPUs managed by the idle injection device
+ *
+ * This is the initialization function in charge of creating the
+ * initializing of the timer and allocate the structures. It does not
+ * starts the idle injection cycles.
+ *
+ * Return: NULL if an allocation fails.
+ */
+struct idle_injection_device *idle_injection_register(struct cpumask *cpumask)
+{
+	struct idle_injection_device *ii_dev;
+	int cpu;
+
+	ii_dev = ii_dev_alloc();
+	if (!ii_dev)
+		return NULL;
+
+	cpumask_copy(ii_dev->cpumask, cpumask);
+	hrtimer_init(&ii_dev->timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
+	ii_dev->timer.function = idle_injection_wakeup_fn;
+	init_completion(&ii_dev->stop_complete);
+
+	for_each_cpu(cpu, ii_dev->cpumask) {
+
+		if (per_cpu(idle_injection_device, cpu)) {
+			pr_err("cpu%d is already registered\n", cpu);
+			goto out_rollback_per_cpu;
+		}
+
+		per_cpu(idle_injection_device, cpu) = ii_dev;
+	}
+
+	return ii_dev;
+
+out_rollback_per_cpu:
+	for_each_cpu(cpu, ii_dev->cpumask)
+		per_cpu(idle_injection_device, cpu) = NULL;
+
+	ii_dev_free(ii_dev);
+
+	return NULL;
+}
+
+/**
+ * idle_injection_unregister - Unregister the idle injection device
+ * @ii_dev: a pointer to an idle injection device
+ *
+ * The function is in charge of stopping the idle injections,
+ * unregister the kthreads and free the allocated memory in the
+ * register function.
+ */
+void idle_injection_unregister(struct idle_injection_device *ii_dev)
+{
+	int cpu;
+
+	idle_injection_stop(ii_dev);
+
+	for_each_cpu(cpu, ii_dev->cpumask)
+		per_cpu(idle_injection_device, cpu) = NULL;
+
+	ii_dev_free(ii_dev);
+}
+
+static struct smp_hotplug_thread idle_injection_threads = {
+	.store = &idle_injection_thread.tsk,
+	.setup = idle_injection_setup,
+	.park = idle_injection_park,
+	.thread_fn = idle_injection_fn,
+	.thread_comm = "idle_inject/%u",
+	.thread_should_run = idle_injection_should_run,
+};
+
+static int __init idle_injection_init(void)
+{
+	return smpboot_register_percpu_thread(&idle_injection_threads);
+}
+early_initcall(idle_injection_init);
diff --git a/include/linux/idle_injection.h b/include/linux/idle_injection.h
new file mode 100644
index 0000000..ffd20d7
--- /dev/null
+++ b/include/linux/idle_injection.h
@@ -0,0 +1,29 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2018 Linaro Ltd
+ *
+ * Author: Daniel Lezcano <daniel.lezcano@linaro.org>
+ *
+ */
+#ifndef __IDLE_INJECTION_H__
+#define __IDLE_INJECTION_H__
+
+/* private idle injection device structure */
+struct idle_injection_device;
+
+struct idle_injection_device *idle_injection_register(struct cpumask *cpumask);
+
+void idle_injection_unregister(struct idle_injection_device *ii_dev);
+
+int idle_injection_start(struct idle_injection_device *ii_dev);
+
+void idle_injection_stop(struct idle_injection_device *ii_dev);
+
+void idle_injection_set_duration(struct idle_injection_device *ii_dev,
+				 unsigned int run_duration_ms,
+				 unsigned int idle_duration_ms);
+
+void idle_injection_get_duration(struct idle_injection_device *ii_dev,
+				 unsigned int *run_duration_ms,
+				 unsigned int *idle_duration_ms);
+#endif /* __IDLE_INJECTION_H__ */
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH V6] powercap/drivers/idle_injection: Add an idle injection framework
  2018-06-12 12:00 [PATCH V6] powercap/drivers/idle_injection: Add an idle injection framework Daniel Lezcano
@ 2018-06-12 12:30 ` Peter Zijlstra
  2018-06-12 12:44   ` Daniel Lezcano
  2018-06-12 12:59 ` Peter Zijlstra
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 27+ messages in thread
From: Peter Zijlstra @ 2018-06-12 12:30 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: viresh.kumar, rjw, linux-kernel, linux-pm, Eduardo Valentin,
	Javi Merino, Leo Yan, Kevin Wangtao, Vincent Guittot, Rui Zhang,
	Daniel Thompson, Andrea Parri

On Tue, Jun 12, 2018 at 02:00:11PM +0200, Daniel Lezcano wrote:
> +static void __idle_injection_wakeup(struct idle_injection_device *ii_dev)
> +{
> +	struct idle_injection_thread *iit;
> +	struct cpumask tmp;
> +	unsigned int cpu;
> +
> +	cpumask_and(&tmp, ii_dev->cpumask, cpu_online_mask);

You should not be having a cpumask on the stack. Those things can be
ginormous.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH V6] powercap/drivers/idle_injection: Add an idle injection framework
  2018-06-12 12:30 ` Peter Zijlstra
@ 2018-06-12 12:44   ` Daniel Lezcano
  2018-06-12 12:52     ` Peter Zijlstra
  0 siblings, 1 reply; 27+ messages in thread
From: Daniel Lezcano @ 2018-06-12 12:44 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: viresh.kumar, rjw, linux-kernel, linux-pm, Eduardo Valentin,
	Javi Merino, Leo Yan, Kevin Wangtao, Vincent Guittot, Rui Zhang,
	Daniel Thompson, Andrea Parri

On 12/06/2018 14:30, Peter Zijlstra wrote:
> On Tue, Jun 12, 2018 at 02:00:11PM +0200, Daniel Lezcano wrote:
>> +static void __idle_injection_wakeup(struct idle_injection_device *ii_dev)
>> +{
>> +	struct idle_injection_thread *iit;
>> +	struct cpumask tmp;
>> +	unsigned int cpu;
>> +
>> +	cpumask_and(&tmp, ii_dev->cpumask, cpu_online_mask);
> 
> You should not be having a cpumask on the stack. Those things can be
> ginormous.

Ok, the kernel code uses of cpumask_t on the stack when dealing with
cpumask_and. I assume it is also not recommended.

What would be the best practice ? Allocate a per cpumask at init time as
a temporary mask to work with ?




-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH V6] powercap/drivers/idle_injection: Add an idle injection framework
  2018-06-12 12:44   ` Daniel Lezcano
@ 2018-06-12 12:52     ` Peter Zijlstra
  2018-06-12 13:02       ` Daniel Lezcano
  0 siblings, 1 reply; 27+ messages in thread
From: Peter Zijlstra @ 2018-06-12 12:52 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: viresh.kumar, rjw, linux-kernel, linux-pm, Eduardo Valentin,
	Javi Merino, Leo Yan, Kevin Wangtao, Vincent Guittot, Rui Zhang,
	Daniel Thompson, Andrea Parri

On Tue, Jun 12, 2018 at 02:44:29PM +0200, Daniel Lezcano wrote:
> On 12/06/2018 14:30, Peter Zijlstra wrote:
> > On Tue, Jun 12, 2018 at 02:00:11PM +0200, Daniel Lezcano wrote:
> >> +static void __idle_injection_wakeup(struct idle_injection_device *ii_dev)
> >> +{
> >> +	struct idle_injection_thread *iit;
> >> +	struct cpumask tmp;
> >> +	unsigned int cpu;
> >> +
> >> +	cpumask_and(&tmp, ii_dev->cpumask, cpu_online_mask);
> > 
> > You should not be having a cpumask on the stack. Those things can be
> > ginormous.
> 
> Ok, the kernel code uses of cpumask_t on the stack when dealing with
> cpumask_and. I assume it is also not recommended.

Yes, that should all get fixed. It's mostly legacy code I suppose. It's
been at least 10 years I think since we merged the whole
CPUMASK_OFFSTACK stuff.

> What would be the best practice ? Allocate a per cpumask at init time as
> a temporary mask to work with ?

In this case, you can do:

+       for_each_cpu_and(cpu, &ii_dev->cpumask, cpu_online_mask) {
+               iit = per_cpu_ptr(&idle_injection_thread, cpu);
+               iit->should_run = 1;
+               wake_up_process(iit->tsk);
+       }

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH V6] powercap/drivers/idle_injection: Add an idle injection framework
  2018-06-12 12:00 [PATCH V6] powercap/drivers/idle_injection: Add an idle injection framework Daniel Lezcano
  2018-06-12 12:30 ` Peter Zijlstra
@ 2018-06-12 12:59 ` Peter Zijlstra
  2018-06-13  8:24   ` Viresh Kumar
  2018-06-12 14:23 ` Peter Zijlstra
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 27+ messages in thread
From: Peter Zijlstra @ 2018-06-12 12:59 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: viresh.kumar, rjw, linux-kernel, linux-pm, Eduardo Valentin,
	Javi Merino, Leo Yan, Kevin Wangtao, Vincent Guittot, Rui Zhang,
	Daniel Thompson, Andrea Parri

On Tue, Jun 12, 2018 at 02:00:11PM +0200, Daniel Lezcano wrote:
> +struct idle_injection_device {

remove this:
> +	cpumask_var_t cpumask;

> +	struct hrtimer timer;
> +	struct completion stop_complete;
> +	unsigned int idle_duration_ms;
> +	unsigned int run_duration_ms;
> +	atomic_t count;

add:
	unsigned long cpumask[0];
> +};


> +static struct idle_injection_device *ii_dev_alloc(void)
> +{
> +	struct idle_injection_device *ii_dev;
> +
> +	ii_dev = kzalloc(sizeof(*ii_dev), GFP_KERNEL);

use:

	sizeof(*ii_dev) + cpumask_size()

> +	if (!ii_dev)
> +		return NULL;
> +

delete:

> +	if (!alloc_cpumask_var(&ii_dev->cpumask, GFP_KERNEL)) {
> +		kfree(ii_dev);
> +		return NULL;
> +	}
> +
> +	return ii_dev;
> +}

And use:

	to_cpumask(ii_dev->cpumask)

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH V6] powercap/drivers/idle_injection: Add an idle injection framework
  2018-06-12 12:52     ` Peter Zijlstra
@ 2018-06-12 13:02       ` Daniel Lezcano
  2018-06-12 14:06         ` Peter Zijlstra
  0 siblings, 1 reply; 27+ messages in thread
From: Daniel Lezcano @ 2018-06-12 13:02 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: viresh.kumar, rjw, linux-kernel, linux-pm, Eduardo Valentin,
	Javi Merino, Leo Yan, Kevin Wangtao, Vincent Guittot, Rui Zhang,
	Daniel Thompson, Andrea Parri

On 12/06/2018 14:52, Peter Zijlstra wrote:
> On Tue, Jun 12, 2018 at 02:44:29PM +0200, Daniel Lezcano wrote:
>> On 12/06/2018 14:30, Peter Zijlstra wrote:
>>> On Tue, Jun 12, 2018 at 02:00:11PM +0200, Daniel Lezcano wrote:
>>>> +static void __idle_injection_wakeup(struct idle_injection_device *ii_dev)
>>>> +{
>>>> +	struct idle_injection_thread *iit;
>>>> +	struct cpumask tmp;
>>>> +	unsigned int cpu;
>>>> +
>>>> +	cpumask_and(&tmp, ii_dev->cpumask, cpu_online_mask);
>>>
>>> You should not be having a cpumask on the stack. Those things can be
>>> ginormous.
>>
>> Ok, the kernel code uses of cpumask_t on the stack when dealing with
>> cpumask_and. I assume it is also not recommended.
> 
> Yes, that should all get fixed. It's mostly legacy code I suppose. It's
> been at least 10 years I think since we merged the whole
> CPUMASK_OFFSTACK stuff.
> 
>> What would be the best practice ? Allocate a per cpumask at init time as
>> a temporary mask to work with ?
> 
> In this case, you can do:

That is what we had before but we change the code to set the count
before waking up the task, so compute the cpumask_weight of the
resulting AND right before this loop.

> +       for_each_cpu_and(cpu, &ii_dev->cpumask, cpu_online_mask) {
> +               iit = per_cpu_ptr(&idle_injection_thread, cpu);
> +               iit->should_run = 1;
> +               wake_up_process(iit->tsk);
> +       }
> 


-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH V6] powercap/drivers/idle_injection: Add an idle injection framework
  2018-06-12 13:02       ` Daniel Lezcano
@ 2018-06-12 14:06         ` Peter Zijlstra
  2018-06-12 14:37           ` Daniel Lezcano
  0 siblings, 1 reply; 27+ messages in thread
From: Peter Zijlstra @ 2018-06-12 14:06 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: viresh.kumar, rjw, linux-kernel, linux-pm, Eduardo Valentin,
	Javi Merino, Leo Yan, Kevin Wangtao, Vincent Guittot, Rui Zhang,
	Daniel Thompson, Andrea Parri

On Tue, Jun 12, 2018 at 03:02:14PM +0200, Daniel Lezcano wrote:
> On 12/06/2018 14:52, Peter Zijlstra wrote:
> > In this case, you can do:
> 
> That is what we had before but we change the code to set the count
> before waking up the task, so compute the cpumask_weight of the
> resulting AND right before this loop.
> 
> > +       for_each_cpu_and(cpu, &ii_dev->cpumask, cpu_online_mask) {
> > +               iit = per_cpu_ptr(&idle_injection_thread, cpu);
> > +               iit->should_run = 1;
> > +               wake_up_process(iit->tsk);
> > +       }


Ah, I see, but since you do:

	if (atomic_dec_and_test())
	  last_man()

where that last_man() thing will start a timer, there is no real problem
with doing atomic_inc() with before wake_up_process().

Yes, it allows doing last_man, too often, but repeated hrtimer_start()
will DTRT and reprogram the timer.

Also, last_man() uses @run_duration, but the way I read it, the timer is
for waking things up again, this means it is in fact the sleep duration,
no?

Furthermore, should you not be using hrtimer_forward(&timer,
idle_duration + run_duration) instead? AFAICT the current scheme is
prone to drifting.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH V6] powercap/drivers/idle_injection: Add an idle injection framework
  2018-06-12 12:00 [PATCH V6] powercap/drivers/idle_injection: Add an idle injection framework Daniel Lezcano
  2018-06-12 12:30 ` Peter Zijlstra
  2018-06-12 12:59 ` Peter Zijlstra
@ 2018-06-12 14:23 ` Peter Zijlstra
  2018-06-12 14:45   ` Daniel Lezcano
  2018-06-12 14:26 ` Peter Zijlstra
  2018-06-13 15:54 ` Pandruvada, Srinivas
  4 siblings, 1 reply; 27+ messages in thread
From: Peter Zijlstra @ 2018-06-12 14:23 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: viresh.kumar, rjw, linux-kernel, linux-pm, Eduardo Valentin,
	Javi Merino, Leo Yan, Kevin Wangtao, Vincent Guittot, Rui Zhang,
	Daniel Thompson, Andrea Parri

On Tue, Jun 12, 2018 at 02:00:11PM +0200, Daniel Lezcano wrote:
> +static void idle_injection_last_man(struct idle_injection_device *ii_dev)
> +{
> +	unsigned int run_duration_ms;
> +
> +	run_duration_ms = READ_ONCE(ii_dev->run_duration_ms);
> +	if (run_duration_ms) {
> +		hrtimer_start(&ii_dev->timer, ms_to_ktime(run_duration_ms),
> +			      HRTIMER_MODE_REL_PINNED);

What's the point of PINNED here? AFAICT this gets called from a
different CPU every time.

> +		return;
> +	}
> +
> +	complete(&ii_dev->stop_complete);
> +}

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH V6] powercap/drivers/idle_injection: Add an idle injection framework
  2018-06-12 12:00 [PATCH V6] powercap/drivers/idle_injection: Add an idle injection framework Daniel Lezcano
                   ` (2 preceding siblings ...)
  2018-06-12 14:23 ` Peter Zijlstra
@ 2018-06-12 14:26 ` Peter Zijlstra
  2018-06-12 14:46   ` Daniel Lezcano
  2018-06-13 15:54 ` Pandruvada, Srinivas
  4 siblings, 1 reply; 27+ messages in thread
From: Peter Zijlstra @ 2018-06-12 14:26 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: viresh.kumar, rjw, linux-kernel, linux-pm, Eduardo Valentin,
	Javi Merino, Leo Yan, Kevin Wangtao, Vincent Guittot, Rui Zhang,
	Daniel Thompson, Andrea Parri

On Tue, Jun 12, 2018 at 02:00:11PM +0200, Daniel Lezcano wrote:
> +void idle_injection_stop(struct idle_injection_device *ii_dev)
> +{
> +	pr_debug("Stopping injecting idle cycles on CPUs '%*pbl'\n",
> +		 cpumask_pr_args(ii_dev->cpumask));
> +
> +	idle_injection_set_duration(ii_dev, 0, 0);
> +
> +	wait_for_completion_interruptible(&ii_dev->stop_complete);
> +}

interruptible?

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH V6] powercap/drivers/idle_injection: Add an idle injection framework
  2018-06-12 14:06         ` Peter Zijlstra
@ 2018-06-12 14:37           ` Daniel Lezcano
  2018-06-12 15:58             ` Peter Zijlstra
  0 siblings, 1 reply; 27+ messages in thread
From: Daniel Lezcano @ 2018-06-12 14:37 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: viresh.kumar, rjw, linux-kernel, linux-pm, Eduardo Valentin,
	Javi Merino, Leo Yan, Kevin Wangtao, Vincent Guittot, Rui Zhang,
	Daniel Thompson, Andrea Parri

On 12/06/2018 16:06, Peter Zijlstra wrote:
> On Tue, Jun 12, 2018 at 03:02:14PM +0200, Daniel Lezcano wrote:
>> On 12/06/2018 14:52, Peter Zijlstra wrote:
>>> In this case, you can do:
>>
>> That is what we had before but we change the code to set the count
>> before waking up the task, so compute the cpumask_weight of the
>> resulting AND right before this loop.
>>
>>> +       for_each_cpu_and(cpu, &ii_dev->cpumask, cpu_online_mask) {
>>> +               iit = per_cpu_ptr(&idle_injection_thread, cpu);
>>> +               iit->should_run = 1;
>>> +               wake_up_process(iit->tsk);
>>> +       }
> 
> 
> Ah, I see, but since you do:
> 
> 	if (atomic_dec_and_test())
> 	  last_man()
> 
> where that last_man() thing will start a timer, there is no real problem
> with doing atomic_inc() with before wake_up_process().

Viresh was worried about the scenario:

https://lkml.org/lkml/2018/6/5/276


> Yes, it allows doing last_man, too often, but repeated hrtimer_start()
> will DTRT and reprogram the timer.
> 
> Also, last_man() uses @run_duration, but the way I read it, the timer is
> for waking things up again, this means it is in fact the sleep duration,
> no?

No, it is the next idle injection deadline, meanwhile we let the system
continue running.

The sleep duration is managed by another timer in play_idle().

> Furthermore, should you not be using hrtimer_forward(&timer,
> idle_duration + run_duration) instead? AFAICT the current scheme is
> prone to drifting.

(I assume you meant setting the timer in the wakeup task function).

Yes, drifting is not an issue if that happens. This scheme is simpler
and safer than setting the timer ahead before waking up the tasks with
the risk it expires before all the tasks ended their idle cycles.



-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH V6] powercap/drivers/idle_injection: Add an idle injection framework
  2018-06-12 14:23 ` Peter Zijlstra
@ 2018-06-12 14:45   ` Daniel Lezcano
  0 siblings, 0 replies; 27+ messages in thread
From: Daniel Lezcano @ 2018-06-12 14:45 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: viresh.kumar, rjw, linux-kernel, linux-pm, Eduardo Valentin,
	Javi Merino, Leo Yan, Kevin Wangtao, Vincent Guittot, Rui Zhang,
	Daniel Thompson, Andrea Parri

On 12/06/2018 16:23, Peter Zijlstra wrote:
> On Tue, Jun 12, 2018 at 02:00:11PM +0200, Daniel Lezcano wrote:
>> +static void idle_injection_last_man(struct idle_injection_device *ii_dev)
>> +{
>> +	unsigned int run_duration_ms;
>> +
>> +	run_duration_ms = READ_ONCE(ii_dev->run_duration_ms);
>> +	if (run_duration_ms) {
>> +		hrtimer_start(&ii_dev->timer, ms_to_ktime(run_duration_ms),
>> +			      HRTIMER_MODE_REL_PINNED);
> 
> What's the point of PINNED here? AFAICT this gets called from a
> different CPU every time.

Right, I will remove it.

-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH V6] powercap/drivers/idle_injection: Add an idle injection framework
  2018-06-12 14:26 ` Peter Zijlstra
@ 2018-06-12 14:46   ` Daniel Lezcano
  0 siblings, 0 replies; 27+ messages in thread
From: Daniel Lezcano @ 2018-06-12 14:46 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: viresh.kumar, rjw, linux-kernel, linux-pm, Eduardo Valentin,
	Javi Merino, Leo Yan, Kevin Wangtao, Vincent Guittot, Rui Zhang,
	Daniel Thompson, Andrea Parri

On 12/06/2018 16:26, Peter Zijlstra wrote:
> On Tue, Jun 12, 2018 at 02:00:11PM +0200, Daniel Lezcano wrote:
>> +void idle_injection_stop(struct idle_injection_device *ii_dev)
>> +{
>> +	pr_debug("Stopping injecting idle cycles on CPUs '%*pbl'\n",
>> +		 cpumask_pr_args(ii_dev->cpumask));
>> +
>> +	idle_injection_set_duration(ii_dev, 0, 0);
>> +
>> +	wait_for_completion_interruptible(&ii_dev->stop_complete);
>> +}
> 
> interruptible?

Right, can be removed.


-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH V6] powercap/drivers/idle_injection: Add an idle injection framework
  2018-06-12 14:37           ` Daniel Lezcano
@ 2018-06-12 15:58             ` Peter Zijlstra
  2018-06-12 17:02               ` Daniel Lezcano
  0 siblings, 1 reply; 27+ messages in thread
From: Peter Zijlstra @ 2018-06-12 15:58 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: viresh.kumar, rjw, linux-kernel, linux-pm, Eduardo Valentin,
	Javi Merino, Leo Yan, Kevin Wangtao, Vincent Guittot, Rui Zhang,
	Daniel Thompson, Andrea Parri

On Tue, Jun 12, 2018 at 04:37:17PM +0200, Daniel Lezcano wrote:
> On 12/06/2018 16:06, Peter Zijlstra wrote:
> > On Tue, Jun 12, 2018 at 03:02:14PM +0200, Daniel Lezcano wrote:
> >> On 12/06/2018 14:52, Peter Zijlstra wrote:
> >>> In this case, you can do:
> >>
> >> That is what we had before but we change the code to set the count
> >> before waking up the task, so compute the cpumask_weight of the
> >> resulting AND right before this loop.
> >>
> >>> +       for_each_cpu_and(cpu, &ii_dev->cpumask, cpu_online_mask) {
> >>> +               iit = per_cpu_ptr(&idle_injection_thread, cpu);
> >>> +               iit->should_run = 1;
> >>> +               wake_up_process(iit->tsk);
> >>> +       }
> > 
> > 
> > Ah, I see, but since you do:
> > 
> > 	if (atomic_dec_and_test())
> > 	  last_man()
> > 
> > where that last_man() thing will start a timer, there is no real problem
> > with doing atomic_inc() with before wake_up_process().
> 
> Viresh was worried about the scenario:
> 
> https://lkml.org/lkml/2018/6/5/276

Ah, but I think you have more races, for instance look at wakeup vs
park, what if wakeup sets should_run after you've just checked it?

Then you have an inc without a dec.

> > Also, last_man() uses @run_duration, but the way I read it, the timer is
> > for waking things up again, this means it is in fact the sleep duration,
> > no?
> 
> No, it is the next idle injection deadline, meanwhile we let the system
> continue running.
> 
> The sleep duration is managed by another timer in play_idle().

No, that's the idle duration. Maybe avoid the issue entire by having a
{period,idle} tuple, where your old run := period - idle.

> > Furthermore, should you not be using hrtimer_forward(&timer,
> > idle_duration + run_duration) instead? AFAICT the current scheme is
> > prone to drifting.
> 
> (I assume you meant setting the timer in the wakeup task function).
> 
> Yes, drifting is not an issue if that happens. This scheme is simpler
> and safer than setting the timer ahead before waking up the tasks with
> the risk it expires before all the tasks ended their idle cycles.

sloppy though..

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH V6] powercap/drivers/idle_injection: Add an idle injection framework
  2018-06-12 15:58             ` Peter Zijlstra
@ 2018-06-12 17:02               ` Daniel Lezcano
  2018-06-12 17:35                 ` Peter Zijlstra
  0 siblings, 1 reply; 27+ messages in thread
From: Daniel Lezcano @ 2018-06-12 17:02 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: viresh.kumar, rjw, linux-kernel, linux-pm, Eduardo Valentin,
	Javi Merino, Leo Yan, Kevin Wangtao, Vincent Guittot, Rui Zhang,
	Daniel Thompson, Andrea Parri

On 12/06/2018 17:58, Peter Zijlstra wrote:
> On Tue, Jun 12, 2018 at 04:37:17PM +0200, Daniel Lezcano wrote:
>> On 12/06/2018 16:06, Peter Zijlstra wrote:
>>> On Tue, Jun 12, 2018 at 03:02:14PM +0200, Daniel Lezcano wrote:
>>>> On 12/06/2018 14:52, Peter Zijlstra wrote:
>>>>> In this case, you can do:
>>>>
>>>> That is what we had before but we change the code to set the count
>>>> before waking up the task, so compute the cpumask_weight of the
>>>> resulting AND right before this loop.
>>>>
>>>>> +       for_each_cpu_and(cpu, &ii_dev->cpumask, cpu_online_mask) {
>>>>> +               iit = per_cpu_ptr(&idle_injection_thread, cpu);
>>>>> +               iit->should_run = 1;
>>>>> +               wake_up_process(iit->tsk);
>>>>> +       }
>>>
>>>
>>> Ah, I see, but since you do:
>>>
>>> 	if (atomic_dec_and_test())
>>> 	  last_man()
>>>
>>> where that last_man() thing will start a timer, there is no real problem
>>> with doing atomic_inc() with before wake_up_process().
>>
>> Viresh was worried about the scenario:
>>
>> https://lkml.org/lkml/2018/6/5/276
> 
> Ah, but I think you have more races, for instance look at wakeup vs
> park, what if wakeup sets should_run after you've just checked it?
> 
> Then you have an inc without a dec.

Mmh, it is unclear for me if the park() vs wakeup() can happen at the
same time.

If the park() function is called, that means the hotplug is allowed.

If the hotplug is allowed, we can modify the online mask.

What happens with the online mask when we are processing it in an
interrupt context ?

>>> Also, last_man() uses @run_duration, but the way I read it, the timer is
>>> for waking things up again, this means it is in fact the sleep duration,
>>> no?
>>
>> No, it is the next idle injection deadline, meanwhile we let the system
>> continue running.
>>
>> The sleep duration is managed by another timer in play_idle().
> 
> No, that's the idle duration. 

Ah, ok. I misunderstood what you meant by 'sleep' duration.

> Maybe avoid the issue entire by having a
> {period,idle} tuple, where your old run := period - idle.

Can you elaborate ? I don't get it.

>>> Furthermore, should you not be using hrtimer_forward(&timer,
>>> idle_duration + run_duration) instead? AFAICT the current scheme is
>>> prone to drifting.
>>
>> (I assume you meant setting the timer in the wakeup task function).
>>
>> Yes, drifting is not an issue if that happens. This scheme is simpler
>> and safer than setting the timer ahead before waking up the tasks with
>> the risk it expires before all the tasks ended their idle cycles.
> 
> sloppy though..

Ok, do you prefer to see the timer set in the wakeup function and thus
having a periodic tick for the idle injection ?


-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH V6] powercap/drivers/idle_injection: Add an idle injection framework
  2018-06-12 17:02               ` Daniel Lezcano
@ 2018-06-12 17:35                 ` Peter Zijlstra
  2018-06-13  8:55                   ` Viresh Kumar
  2018-06-15  8:13                   ` Daniel Lezcano
  0 siblings, 2 replies; 27+ messages in thread
From: Peter Zijlstra @ 2018-06-12 17:35 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: viresh.kumar, rjw, linux-kernel, linux-pm, Eduardo Valentin,
	Javi Merino, Leo Yan, Kevin Wangtao, Vincent Guittot, Rui Zhang,
	Daniel Thompson, Andrea Parri

On Tue, Jun 12, 2018 at 07:02:57PM +0200, Daniel Lezcano wrote:
> Mmh, it is unclear for me if the park() vs wakeup() can happen at the
> same time.
> 
> If the park() function is called, that means the hotplug is allowed.

No, it means we're inside hot-un-plug, but that doesn't stop the hrtimer
from firing.

> If the hotplug is allowed, we can modify the online mask.
> 
> What happens with the online mask when we are processing it in an
> interrupt context ?

RCU-like, if you observe a CPU in the online mask, it will stay
available, but the bit might get cleared.

> > Maybe avoid the issue entire by having a
> > {period,idle} tuple, where your old run := period - idle.
> 
> Can you elaborate ? I don't get it.

Have a period parameter that specifies the interval in which you have
one injected idle, and specify for how long you want to inject idle;
then obviously idle < period.

> >>> Furthermore, should you not be using hrtimer_forward(&timer,
> >>> idle_duration + run_duration) instead? AFAICT the current scheme is
> >>> prone to drifting.
> >>
> >> (I assume you meant setting the timer in the wakeup task function).
> >>
> >> Yes, drifting is not an issue if that happens. This scheme is simpler
> >> and safer than setting the timer ahead before waking up the tasks with
> >> the risk it expires before all the tasks ended their idle cycles.
> > 
> > sloppy though..
> 
> Ok, do you prefer to see the timer set in the wakeup function and thus
> having a periodic tick for the idle injection ?

I think having a HRTIMER_RESTART handler that does hrtimer_forward() is
the most sensible. You will end up having to deal with threads not being
ready, but I think that's not a real problem.



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH V6] powercap/drivers/idle_injection: Add an idle injection framework
  2018-06-12 12:59 ` Peter Zijlstra
@ 2018-06-13  8:24   ` Viresh Kumar
  0 siblings, 0 replies; 27+ messages in thread
From: Viresh Kumar @ 2018-06-13  8:24 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Daniel Lezcano, rjw, linux-kernel, linux-pm, Eduardo Valentin,
	Javi Merino, Leo Yan, Kevin Wangtao, Vincent Guittot, Rui Zhang,
	Daniel Thompson, Andrea Parri

On 12-06-18, 14:59, Peter Zijlstra wrote:
> On Tue, Jun 12, 2018 at 02:00:11PM +0200, Daniel Lezcano wrote:
> > +struct idle_injection_device {
> 
> remove this:
> > +	cpumask_var_t cpumask;
> 
> > +	struct hrtimer timer;
> > +	struct completion stop_complete;
> > +	unsigned int idle_duration_ms;
> > +	unsigned int run_duration_ms;
> > +	atomic_t count;
> 
> add:
> 	unsigned long cpumask[0];
> > +};
> 
> 
> > +static struct idle_injection_device *ii_dev_alloc(void)
> > +{
> > +	struct idle_injection_device *ii_dev;
> > +
> > +	ii_dev = kzalloc(sizeof(*ii_dev), GFP_KERNEL);
> 
> use:
> 
> 	sizeof(*ii_dev) + cpumask_size()
> 
> > +	if (!ii_dev)
> > +		return NULL;
> > +
> 
> delete:
> 
> > +	if (!alloc_cpumask_var(&ii_dev->cpumask, GFP_KERNEL)) {
> > +		kfree(ii_dev);
> > +		return NULL;
> > +	}
> > +
> > +	return ii_dev;
> > +}
> 
> And use:
> 
> 	to_cpumask(ii_dev->cpumask)

What's the benefit of these changes? Is it just about not allocating memory
twice or more than that ?

And what could we do in situations where we need two cpumask variables (we have
a case in cpufreq core for that) ?

-- 
viresh

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH V6] powercap/drivers/idle_injection: Add an idle injection framework
  2018-06-12 17:35                 ` Peter Zijlstra
@ 2018-06-13  8:55                   ` Viresh Kumar
  2018-06-13  9:03                     ` Daniel Lezcano
  2018-06-15  8:13                   ` Daniel Lezcano
  1 sibling, 1 reply; 27+ messages in thread
From: Viresh Kumar @ 2018-06-13  8:55 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Daniel Lezcano, rjw, linux-kernel, linux-pm, Eduardo Valentin,
	Javi Merino, Leo Yan, Kevin Wangtao, Vincent Guittot, Rui Zhang,
	Daniel Thompson, Andrea Parri

On 12-06-18, 19:35, Peter Zijlstra wrote:
> On Tue, Jun 12, 2018 at 07:02:57PM +0200, Daniel Lezcano wrote:
> > Mmh, it is unclear for me if the park() vs wakeup() can happen at the
> > same time.
> > 
> > If the park() function is called, that means the hotplug is allowed.
> 
> No, it means we're inside hot-un-plug, but that doesn't stop the hrtimer
> from firing.
> 
> > If the hotplug is allowed, we can modify the online mask.
> > 
> > What happens with the online mask when we are processing it in an
> > interrupt context ?
> 
> RCU-like, if you observe a CPU in the online mask, it will stay
> available, but the bit might get cleared.
> 
> > > Maybe avoid the issue entire by having a
> > > {period,idle} tuple, where your old run := period - idle.
> > 
> > Can you elaborate ? I don't get it.
> 
> Have a period parameter that specifies the interval in which you have
> one injected idle, and specify for how long you want to inject idle;
> then obviously idle < period.
> 
> > >>> Furthermore, should you not be using hrtimer_forward(&timer,
> > >>> idle_duration + run_duration) instead? AFAICT the current scheme is
> > >>> prone to drifting.
> > >>
> > >> (I assume you meant setting the timer in the wakeup task function).
> > >>
> > >> Yes, drifting is not an issue if that happens. This scheme is simpler
> > >> and safer than setting the timer ahead before waking up the tasks with
> > >> the risk it expires before all the tasks ended their idle cycles.
> > > 
> > > sloppy though..
> > 
> > Ok, do you prefer to see the timer set in the wakeup function and thus
> > having a periodic tick for the idle injection ?
> 
> I think having a HRTIMER_RESTART handler that does hrtimer_forward() is
> the most sensible. You will end up having to deal with threads not being
> ready, but I think that's not a real problem.

Honestly speaking I am not sure if we can fix all the races we have identified
until now, with the current design and the fear of using resources after getting
freed will always be there. Too many corner cases really.

The requirement we have is to make sure no kthread or hrtimer is still in use
when the ii_dev is freed from idle_injection_unregister().

May be the only sane way to make it work is to call
smpboot_register_percpu_thread() from idle_injection_register() and then
unregister those threads from idle_injection_unregister().

-- 
viresh

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH V6] powercap/drivers/idle_injection: Add an idle injection framework
  2018-06-13  8:55                   ` Viresh Kumar
@ 2018-06-13  9:03                     ` Daniel Lezcano
  2018-06-13  9:10                       ` Viresh Kumar
  0 siblings, 1 reply; 27+ messages in thread
From: Daniel Lezcano @ 2018-06-13  9:03 UTC (permalink / raw)
  To: Viresh Kumar, Peter Zijlstra
  Cc: rjw, linux-kernel, linux-pm, Eduardo Valentin, Javi Merino,
	Leo Yan, Kevin Wangtao, Vincent Guittot, Rui Zhang,
	Daniel Thompson, Andrea Parri

On 13/06/2018 10:55, Viresh Kumar wrote:
> On 12-06-18, 19:35, Peter Zijlstra wrote:
>> On Tue, Jun 12, 2018 at 07:02:57PM +0200, Daniel Lezcano wrote:
>>> Mmh, it is unclear for me if the park() vs wakeup() can happen at the
>>> same time.
>>>
>>> If the park() function is called, that means the hotplug is allowed.
>>
>> No, it means we're inside hot-un-plug, but that doesn't stop the hrtimer
>> from firing.
>>
>>> If the hotplug is allowed, we can modify the online mask.
>>>
>>> What happens with the online mask when we are processing it in an
>>> interrupt context ?
>>
>> RCU-like, if you observe a CPU in the online mask, it will stay
>> available, but the bit might get cleared.
>>
>>>> Maybe avoid the issue entire by having a
>>>> {period,idle} tuple, where your old run := period - idle.
>>>
>>> Can you elaborate ? I don't get it.
>>
>> Have a period parameter that specifies the interval in which you have
>> one injected idle, and specify for how long you want to inject idle;
>> then obviously idle < period.
>>
>>>>>> Furthermore, should you not be using hrtimer_forward(&timer,
>>>>>> idle_duration + run_duration) instead? AFAICT the current scheme is
>>>>>> prone to drifting.
>>>>>
>>>>> (I assume you meant setting the timer in the wakeup task function).
>>>>>
>>>>> Yes, drifting is not an issue if that happens. This scheme is simpler
>>>>> and safer than setting the timer ahead before waking up the tasks with
>>>>> the risk it expires before all the tasks ended their idle cycles.
>>>>
>>>> sloppy though..
>>>
>>> Ok, do you prefer to see the timer set in the wakeup function and thus
>>> having a periodic tick for the idle injection ?
>>
>> I think having a HRTIMER_RESTART handler that does hrtimer_forward() is
>> the most sensible. You will end up having to deal with threads not being
>> ready, but I think that's not a real problem.
> 
> Honestly speaking I am not sure if we can fix all the races we have identified
> until now, with the current design and the fear of using resources after getting
> freed will always be there. Too many corner cases really.
> 
> The requirement we have is to make sure no kthread or hrtimer is still in use
> when the ii_dev is freed from idle_injection_unregister().
> 
> May be the only sane way to make it work is to call
> smpboot_register_percpu_thread() from idle_injection_register() and then
> unregister those threads from idle_injection_unregister().

nr_threads(smpboot) <> nr_threads(idleinject)

If we are facing races issues, it is because we are trying to avoid
using locks in the code path. With lock and proper refcounting that
should be solved, AFAICT there are similar races with inodes.

Furthermore, hrtimer_forward approach is probably cleaner.



-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH V6] powercap/drivers/idle_injection: Add an idle injection framework
  2018-06-13  9:03                     ` Daniel Lezcano
@ 2018-06-13  9:10                       ` Viresh Kumar
  0 siblings, 0 replies; 27+ messages in thread
From: Viresh Kumar @ 2018-06-13  9:10 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: Peter Zijlstra, rjw, linux-kernel, linux-pm, Eduardo Valentin,
	Javi Merino, Leo Yan, Kevin Wangtao, Vincent Guittot, Rui Zhang,
	Daniel Thompson, Andrea Parri

On 13-06-18, 11:03, Daniel Lezcano wrote:
> nr_threads(smpboot) <> nr_threads(idleinject)
> 
> If we are facing races issues, it is because we are trying to avoid
> using locks in the code path. With lock and proper refcounting that
> should be solved, AFAICT there are similar races with inodes.

I am not sure if locking can make things better here. We aren't using them as we
don't really need them AFAIU :)

-- 
viresh

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH V6] powercap/drivers/idle_injection: Add an idle injection framework
  2018-06-12 12:00 [PATCH V6] powercap/drivers/idle_injection: Add an idle injection framework Daniel Lezcano
                   ` (3 preceding siblings ...)
  2018-06-12 14:26 ` Peter Zijlstra
@ 2018-06-13 15:54 ` Pandruvada, Srinivas
  2018-06-13 20:00   ` Daniel Lezcano
  4 siblings, 1 reply; 27+ messages in thread
From: Pandruvada, Srinivas @ 2018-06-13 15:54 UTC (permalink / raw)
  To: daniel.lezcano, viresh.kumar, rjw
  Cc: linux-kernel, peterz, Zhang, Rui, edubezval, linux-pm,
	andrea.parri, kevin.wangtao, leo.yan, daniel.thompson,
	javi.merino, vincent.guittot

[-- Attachment #1: Type: text/plain, Size: 19084 bytes --]

On Tue, 2018-06-12 at 14:00 +0200, Daniel Lezcano wrote:
> Initially, the cpu_cooling device for ARM was changed by adding a new
> policy inserting idle cycles. The intel_powerclamp driver does a
> similar action.
> 
> Instead of implementing idle injections privately in the cpu_cooling
> device, move the idle injection code in a dedicated framework and
> give
> the opportunity to other frameworks to make use of it.
> 
> The framework relies on the smpboot kthreads which handles via its
> main loop the common code for hotplugging and [un]parking.
> 
> This code was previously tested with the cpu cooling device and went
> through several iterations. It results now in split code and API
> exported in the header file. It was tested with the cpu cooling
> device
> with success.
> 
May be I have missed, but I don't see any use of powercap sysfs.

We created powercap sys that user space is presented a common interface
for power capping. The RAPL driver was also submitted as cooling device
before, but suggestion was to create this powercap.

So if powercap interface is not enough then may be we should enhance
that.

Thanks,
Srinivas

> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
> Cc: Viresh Kumar <viresh.kumar@linaro.org>
> Cc: Eduardo Valentin <edubezval@gmail.com>
> Cc: Javi Merino <javi.merino@kernel.org>
> Cc: Leo Yan <leo.yan@linaro.org>
> Cc: Kevin Wangtao <kevin.wangtao@linaro.org>
> Cc: Vincent Guittot <vincent.guittot@linaro.org>
> Cc: Rui Zhang <rui.zhang@intel.com>
> Cc: Daniel Thompson <daniel.thompson@linaro.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Andrea Parri <andrea.parri@amarulasolutions.com>
> ---
> V6:
>    - Move count to wake up function (Viresh Kumar)
>    - Split atomic/non-atomic context to wake up tasks
>    - Add the park callback to handle unplug/inject race
>    - Replace atomic by READ_ONCE and WRITE_ONCE (Peter Zijlstra)
> 
> V5:
>    - Move init_completion in the init function (Viresh Kumar)
>    - Increment task count in the wakeup function (Viresh Kumar)
>    - Remove rollback at init time (Viresh Kumar)
> 
> V4:
>    - Wait for completion when stopping (Viresh Kumar)
>    - Check init already done and rollback (Viresh Kumar)
> 
> V3:
>    - Fixed typos (Viresh Kumar)
>    - Removed extra blank line (Viresh Kumar)
>    - Added full stop (Viresh Kumar)
>    - Fixed Return kerneldoc format (Viresh Kumar)
>    - Fixed multiple kthreads initialization (Viresh Kumar)
>    - Fixed rollbacking the actions in the unregister function (Viresh
> Kumar)
>    - Make idle injection kthreads name hardcoded
>    - Kthreads are created in the initcall process
> 
>  V2: Fixed checkpatch warnings
> 
> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
> ---
>  drivers/powercap/Kconfig          |  10 +
>  drivers/powercap/Makefile         |   1 +
>  drivers/powercap/idle_injection.c | 414
> ++++++++++++++++++++++++++++++++++++++
>  include/linux/idle_injection.h    |  29 +++
>  4 files changed, 454 insertions(+)
>  create mode 100644 drivers/powercap/idle_injection.c
>  create mode 100644 include/linux/idle_injection.h
> 
> diff --git a/drivers/powercap/Kconfig b/drivers/powercap/Kconfig
> index 85727ef..a767ef2 100644
> --- a/drivers/powercap/Kconfig
> +++ b/drivers/powercap/Kconfig
> @@ -29,4 +29,14 @@ config INTEL_RAPL
>  	  controller, CPU core (Power Plance 0), graphics uncore
> (Power Plane
>  	  1), etc.
>  
> +config IDLE_INJECTION
> +	bool "Idle injection framework"
> +	depends on CPU_IDLE
> +	default n
> +	help
> +	  This enables support for the idle injection framework. It
> +	  provides a way to force idle periods on a set of specified
> +	  CPUs for power capping. Idle period can be injected
> +	  synchronously on a set of specified CPUs or alternatively
> +	  on a per CPU basis.
>  endif
> diff --git a/drivers/powercap/Makefile b/drivers/powercap/Makefile
> index 0a21ef3..c3bbfee 100644
> --- a/drivers/powercap/Makefile
> +++ b/drivers/powercap/Makefile
> @@ -1,2 +1,3 @@
>  obj-$(CONFIG_POWERCAP)	+= powercap_sys.o
>  obj-$(CONFIG_INTEL_RAPL) += intel_rapl.o
> +obj-$(CONFIG_IDLE_INJECTION) += idle_injection.o
> diff --git a/drivers/powercap/idle_injection.c
> b/drivers/powercap/idle_injection.c
> new file mode 100644
> index 0000000..1ff1a46
> --- /dev/null
> +++ b/drivers/powercap/idle_injection.c
> @@ -0,0 +1,414 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright 2018 Linaro Limited
> + *
> + * Author: Daniel Lezcano <daniel.lezcano@linaro.org>
> + *
> + * The idle injection framework proposes a way to force a cpu to
> enter
> + * an idle state during a specified amount of time for a specified
> + * period.
> + *
> + * It relies on the smpboot kthreads which handles, via its main
> loop,
> + * the common code for hotplugging and [un]parking.
> + *
> + * At init time, all the kthreads are created and parked.
> + *
> + * A cpumask is specified as parameter for the idle injection
> + * registering function. The kthreads will be synchronized regarding
> + * this cpumask.
> + *
> + * The idle + run duration is specified via the helpers and then the
> + * idle injection can be started at this point.
> + *
> + * A kthread will call play_idle() with the specified idle duration
> + * from above.
> + *
> + * A timer is set after waking up all the tasks, to the next idle
> + * injection cycle.
> + *
> + * The task handling the timer interrupt will wakeup all the
> kthreads
> + * belonging to the cpumask.
> + */
> +#define pr_fmt(fmt) "ii_dev: " fmt
> +
> +#include <linux/cpu.h>
> +#include <linux/freezer.h>
> +#include <linux/hrtimer.h>
> +#include <linux/sched.h>
> +#include <linux/slab.h>
> +#include <linux/smpboot.h>
> +
> +#include <uapi/linux/sched/types.h>
> +
> +/**
> + * struct idle_injection_thread - task on/off switch structure
> + * @tsk: a pointer to a task_struct injecting the idle cycles
> + * @should_run: a integer used as a boolean by the smpboot kthread
> API
> + */
> +struct idle_injection_thread {
> +	struct task_struct *tsk;
> +	int should_run;
> +};
> +
> +/**
> + * struct idle_injection_device - data for the idle injection
> + * @cpumask: a cpumask containing the list of CPUs managed by the
> device
> + * @timer: a hrtimer giving the tempo for the idle injection
> + * @count: an atomic to keep track of the last task exiting the idle
> cycle
> + * @idle_duration_ms: an unsigned int specifying the idle duration
> + * @run_duration_ms: an unsigned int specifying the running duration
> + */
> +struct idle_injection_device {
> +	cpumask_var_t cpumask;
> +	struct hrtimer timer;
> +	struct completion stop_complete;
> +	unsigned int idle_duration_ms;
> +	unsigned int run_duration_ms;
> +	atomic_t count;
> +};
> +
> +static DEFINE_PER_CPU(struct idle_injection_thread,
> idle_injection_thread);
> +static DEFINE_PER_CPU(struct idle_injection_device *,
> idle_injection_device);
> +
> +/**
> + * __idle_injection_wakeup - Wake up all idle injection threads
> + * @ii_dev: the idle injection device
> + *
> + * Every idle injection task belonging to the idle injection device
> + * and running on an online CPU will be wake up by this call.
> + */
> +static void __idle_injection_wakeup(struct idle_injection_device
> *ii_dev)
> +{
> +	struct idle_injection_thread *iit;
> +	struct cpumask tmp;
> +	unsigned int cpu;
> +
> +	cpumask_and(&tmp, ii_dev->cpumask, cpu_online_mask);
> +
> +	atomic_set(&ii_dev->count, cpumask_weight(&tmp));
> +
> +	for_each_cpu(cpu, &tmp) {
> +		iit = per_cpu_ptr(&idle_injection_thread, cpu);
> +		iit->should_run = 1;
> +		wake_up_process(iit->tsk);
> +	}
> +}
> +
> +/**
> + * idle_injection_wakeup - Wake up all idle injection threads
> + * @ii_dev: the idle injection device
> + *
> + * This function wakes up all the idle injection tasks belonging to
> + * @ii_dev by invoking __idle_injection_wakeup() but with the cpu
> + * hoplug disabled.
> + */
> +static void idle_injection_wakeup(struct idle_injection_device
> *ii_dev)
> +{
> +	get_online_cpus();
> +	__idle_injection_wakeup(ii_dev);
> +	put_online_cpus();
> +}
> +
> +/**
> + * idle_injection_wakeup_fn - idle injection timer callback
> + * @timer: a hrtimer structure
> + *
> + * This function is called when the idle injection timer expires
> which
> + * will wake up the idle injection tasks and these ones, in turn,
> play
> + * idle a specified amount of time.
> + *
> + * Return: HRTIMER_NORESTART.
> + */
> +static enum hrtimer_restart idle_injection_wakeup_fn(struct hrtimer
> *timer)
> +{
> +	struct idle_injection_device *ii_dev =
> +		container_of(timer, struct idle_injection_device,
> timer);
> +
> +	__idle_injection_wakeup(ii_dev);
> +
> +	return HRTIMER_NORESTART;
> +}
> +
> +/**
> + * idle_injection_last_man - operations by the last task
> + * @ii_dev: a pointer to an idle_injection_device structure
> + *
> + * This function groups the operations done by the last idle
> injection
> + * task. It can be called from the idle injection callback as well
> as
> + * from the park callback when the thread is parked at hotplug time.
> + */
> +static void idle_injection_last_man(struct idle_injection_device
> *ii_dev)
> +{
> +	unsigned int run_duration_ms;
> +
> +	run_duration_ms = READ_ONCE(ii_dev->run_duration_ms);
> +	if (run_duration_ms) {
> +		hrtimer_start(&ii_dev->timer,
> ms_to_ktime(run_duration_ms),
> +			      HRTIMER_MODE_REL_PINNED);
> +		return;
> +	}
> +
> +	complete(&ii_dev->stop_complete);
> +}
> +
> +/**
> + * idle_injection_fn - idle injection routine
> + * @cpu: the CPU number the tasks belongs to
> + *
> + * The idle injection routine will stay idle the specified amount of
> + * time
> + */
> +static void idle_injection_fn(unsigned int cpu)
> +{
> +	struct idle_injection_device *ii_dev;
> +	struct idle_injection_thread *iit;
> +	unsigned int idle_duration_ms;
> +
> +	ii_dev = per_cpu(idle_injection_device, cpu);
> +	iit = per_cpu_ptr(&idle_injection_thread, cpu);
> +
> +	/*
> +	 * Boolean used by the smpboot main loop and used as a
> +	 * flip-flop in this function
> +	 */
> +	iit->should_run = 0;
> +
> +	idle_duration_ms = READ_ONCE(ii_dev->idle_duration_ms);
> +	if (idle_duration_ms)
> +		play_idle(idle_duration_ms);
> +
> +	if (atomic_dec_and_test(&ii_dev->count))
> +		idle_injection_last_man(ii_dev);
> +}
> +
> +/**
> + * idle_injection_set_duration - idle and run duration helper
> + * @run_duration_ms: an unsigned int giving the running time in
> milliseconds
> + * @idle_duration_ms: an unsigned int giving the idle time in
> milliseconds
> + */
> +void idle_injection_set_duration(struct idle_injection_device
> *ii_dev,
> +				 unsigned int run_duration_ms,
> +				 unsigned int idle_duration_ms)
> +{
> +	WRITE_ONCE(ii_dev->run_duration_ms, run_duration_ms);
> +	WRITE_ONCE(ii_dev->idle_duration_ms, idle_duration_ms);
> +}
> +
> +/**
> + * idle_injection_get_duration - idle and run duration helper
> + * @run_duration_ms: a pointer to an unsigned int to store the
> running time
> + * @idle_duration_ms: a pointer to an unsigned int to store the idle
> time
> + */
> +void idle_injection_get_duration(struct idle_injection_device
> *ii_dev,
> +				 unsigned int *run_duration_ms,
> +				 unsigned int *idle_duration_ms)
> +{
> +	*run_duration_ms = READ_ONCE(ii_dev->run_duration_ms);
> +	*idle_duration_ms = READ_ONCE(ii_dev->idle_duration_ms);
> +}
> +
> +/**
> + * idle_injection_start - starts the idle injections
> + * @ii_dev: a pointer to an idle_injection_device structure
> + *
> + * The function starts the idle injection cycles by first waking up
> + * all the tasks the ii_dev is attached to and let them handle the
> + * idle-run periods.
> + *
> + * Return: -EINVAL if the idle or the running durations are not set.
> + */
> +int idle_injection_start(struct idle_injection_device *ii_dev)
> +{
> +	if (!READ_ONCE(ii_dev->idle_duration_ms))
> +		return -EINVAL;
> +
> +	if (!READ_ONCE(ii_dev->run_duration_ms))
> +		return -EINVAL;
> +
> +	pr_debug("Starting injecting idle cycles on CPUs '%*pbl'\n",
> +		 cpumask_pr_args(ii_dev->cpumask));
> +
> +	idle_injection_wakeup(ii_dev);
> +
> +	return 0;
> +}
> +
> +/**
> + * idle_injection_stop - stops the idle injections
> + * @ii_dev: a pointer to an idle injection_device structure
> + *
> + * The function stops the idle injection by resetting the idle and
> + * running durations and wait for the threads to complete. If we are
> + * in the process of injecting an idle cycle, then this will wait
> the
> + * end of the cycle.
> + */
> +void idle_injection_stop(struct idle_injection_device *ii_dev)
> +{
> +	pr_debug("Stopping injecting idle cycles on CPUs '%*pbl'\n",
> +		 cpumask_pr_args(ii_dev->cpumask));
> +
> +	idle_injection_set_duration(ii_dev, 0, 0);
> +
> +	wait_for_completion_interruptible(&ii_dev->stop_complete);
> +}
> +
> +/**
> + * idle_injection_setup - initialize the current task as a RT task
> + * @cpu: the CPU number where the kthread is running on (not used)
> + *
> + * Called one time, this function is in charge of setting the task
> + * scheduler parameters.
> + */
> +static void idle_injection_setup(unsigned int cpu)
> +{
> +	struct sched_param param = { .sched_priority =
> MAX_USER_RT_PRIO / 2 };
> +
> +	set_freezable();
> +
> +	sched_setscheduler(current, SCHED_FIFO, &param);
> +}
> +
> +/**
> + * idle_injection_park - move out of the idle injection threads pool
> + * @cpu: the CPU number where the kthread is running on
> + *
> + * Catch the situation where the CPU is hotplugged while we were
> about
> + * to inject an idle cycle.
> + */
> +static void idle_injection_park(unsigned int cpu)
> +{
> +	struct idle_injection_device *ii_dev;
> +	struct idle_injection_thread *iit;
> +
> +	ii_dev = per_cpu(idle_injection_device, cpu);
> +	iit = per_cpu_ptr(&idle_injection_thread, cpu);
> +
> +	/*
> +	 * If we are supposed to run an idle cycle but obviously we
> +	 * can't because we are in the process of parking, so bail
> out
> +	 * from the pool of idle injection tasks, do nothing
> otherwise.
> +	 */
> +	if (!iit->should_run)
> +		return;
> +
> +	iit->should_run = 0;
> +
> +	if (atomic_dec_and_test(&ii_dev->count))
> +		idle_injection_last_man(ii_dev);
> +}
> +
> +/**
> + * idle_injection_should_run - function helper for the smpboot API
> + * @cpu: the CPU number where the kthread is running on
> + *
> + * Return: a boolean telling if the thread can run.
> + */
> +static int idle_injection_should_run(unsigned int cpu)
> +{
> +	struct idle_injection_thread *iit =
> +		per_cpu_ptr(&idle_injection_thread, cpu);
> +
> +	return iit->should_run;
> +}
> +
> +static struct idle_injection_device *ii_dev_alloc(void)
> +{
> +	struct idle_injection_device *ii_dev;
> +
> +	ii_dev = kzalloc(sizeof(*ii_dev), GFP_KERNEL);
> +	if (!ii_dev)
> +		return NULL;
> +
> +	if (!alloc_cpumask_var(&ii_dev->cpumask, GFP_KERNEL)) {
> +		kfree(ii_dev);
> +		return NULL;
> +	}
> +
> +	return ii_dev;
> +}
> +
> +static void ii_dev_free(struct idle_injection_device *ii_dev)
> +{
> +	free_cpumask_var(ii_dev->cpumask);
> +	kfree(ii_dev);
> +}
> +
> +/**
> + * idle_injection_register - idle injection init routine
> + * @cpumask: the list of CPUs managed by the idle injection device
> + *
> + * This is the initialization function in charge of creating the
> + * initializing of the timer and allocate the structures. It does
> not
> + * starts the idle injection cycles.
> + *
> + * Return: NULL if an allocation fails.
> + */
> +struct idle_injection_device *idle_injection_register(struct cpumask
> *cpumask)
> +{
> +	struct idle_injection_device *ii_dev;
> +	int cpu;
> +
> +	ii_dev = ii_dev_alloc();
> +	if (!ii_dev)
> +		return NULL;
> +
> +	cpumask_copy(ii_dev->cpumask, cpumask);
> +	hrtimer_init(&ii_dev->timer, CLOCK_MONOTONIC,
> HRTIMER_MODE_REL);
> +	ii_dev->timer.function = idle_injection_wakeup_fn;
> +	init_completion(&ii_dev->stop_complete);
> +
> +	for_each_cpu(cpu, ii_dev->cpumask) {
> +
> +		if (per_cpu(idle_injection_device, cpu)) {
> +			pr_err("cpu%d is already registered\n",
> cpu);
> +			goto out_rollback_per_cpu;
> +		}
> +
> +		per_cpu(idle_injection_device, cpu) = ii_dev;
> +	}
> +
> +	return ii_dev;
> +
> +out_rollback_per_cpu:
> +	for_each_cpu(cpu, ii_dev->cpumask)
> +		per_cpu(idle_injection_device, cpu) = NULL;
> +
> +	ii_dev_free(ii_dev);
> +
> +	return NULL;
> +}
> +
> +/**
> + * idle_injection_unregister - Unregister the idle injection device
> + * @ii_dev: a pointer to an idle injection device
> + *
> + * The function is in charge of stopping the idle injections,
> + * unregister the kthreads and free the allocated memory in the
> + * register function.
> + */
> +void idle_injection_unregister(struct idle_injection_device *ii_dev)
> +{
> +	int cpu;
> +
> +	idle_injection_stop(ii_dev);
> +
> +	for_each_cpu(cpu, ii_dev->cpumask)
> +		per_cpu(idle_injection_device, cpu) = NULL;
> +
> +	ii_dev_free(ii_dev);
> +}
> +
> +static struct smp_hotplug_thread idle_injection_threads = {
> +	.store = &idle_injection_thread.tsk,
> +	.setup = idle_injection_setup,
> +	.park = idle_injection_park,
> +	.thread_fn = idle_injection_fn,
> +	.thread_comm = "idle_inject/%u",
> +	.thread_should_run = idle_injection_should_run,
> +};
> +
> +static int __init idle_injection_init(void)
> +{
> +	return
> smpboot_register_percpu_thread(&idle_injection_threads);
> +}
> +early_initcall(idle_injection_init);
> diff --git a/include/linux/idle_injection.h
> b/include/linux/idle_injection.h
> new file mode 100644
> index 0000000..ffd20d7
> --- /dev/null
> +++ b/include/linux/idle_injection.h
> @@ -0,0 +1,29 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2018 Linaro Ltd
> + *
> + * Author: Daniel Lezcano <daniel.lezcano@linaro.org>
> + *
> + */
> +#ifndef __IDLE_INJECTION_H__
> +#define __IDLE_INJECTION_H__
> +
> +/* private idle injection device structure */
> +struct idle_injection_device;
> +
> +struct idle_injection_device *idle_injection_register(struct cpumask
> *cpumask);
> +
> +void idle_injection_unregister(struct idle_injection_device
> *ii_dev);
> +
> +int idle_injection_start(struct idle_injection_device *ii_dev);
> +
> +void idle_injection_stop(struct idle_injection_device *ii_dev);
> +
> +void idle_injection_set_duration(struct idle_injection_device
> *ii_dev,
> +				 unsigned int run_duration_ms,
> +				 unsigned int idle_duration_ms);
> +
> +void idle_injection_get_duration(struct idle_injection_device
> *ii_dev,
> +				 unsigned int *run_duration_ms,
> +				 unsigned int *idle_duration_ms);
> +#endif /* __IDLE_INJECTION_H__ */

[-- Attachment #2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 3290 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH V6] powercap/drivers/idle_injection: Add an idle injection framework
  2018-06-13 15:54 ` Pandruvada, Srinivas
@ 2018-06-13 20:00   ` Daniel Lezcano
  2018-06-13 20:07     ` Pandruvada, Srinivas
  0 siblings, 1 reply; 27+ messages in thread
From: Daniel Lezcano @ 2018-06-13 20:00 UTC (permalink / raw)
  To: Pandruvada, Srinivas, viresh.kumar, rjw
  Cc: linux-kernel, peterz, Zhang, Rui, edubezval, linux-pm,
	andrea.parri, kevin.wangtao, leo.yan, daniel.thompson,
	javi.merino, vincent.guittot

On 13/06/2018 17:54, Pandruvada, Srinivas wrote:
> On Tue, 2018-06-12 at 14:00 +0200, Daniel Lezcano wrote:
>> Initially, the cpu_cooling device for ARM was changed by adding a new
>> policy inserting idle cycles. The intel_powerclamp driver does a
>> similar action.
>>
>> Instead of implementing idle injections privately in the cpu_cooling
>> device, move the idle injection code in a dedicated framework and
>> give
>> the opportunity to other frameworks to make use of it.
>>
>> The framework relies on the smpboot kthreads which handles via its
>> main loop the common code for hotplugging and [un]parking.
>>
>> This code was previously tested with the cpu cooling device and went
>> through several iterations. It results now in split code and API
>> exported in the header file. It was tested with the cpu cooling
>> device
>> with success.
>>
> May be I have missed, but I don't see any use of powercap sysfs.
> 
> We created powercap sys that user space is presented a common interface
> for power capping. The RAPL driver was also submitted as cooling device
> before, but suggestion was to create this powercap.
> 
> So if powercap interface is not enough then may be we should enhance
> that.

We are creating a framework for idle injection. This framework can then
be used by the cpu_cooling device, the power_clamp and in addition a
power capping for ARM (if it makes sense).



-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH V6] powercap/drivers/idle_injection: Add an idle injection framework
  2018-06-13 20:00   ` Daniel Lezcano
@ 2018-06-13 20:07     ` Pandruvada, Srinivas
  2018-06-13 20:19       ` Daniel Lezcano
  0 siblings, 1 reply; 27+ messages in thread
From: Pandruvada, Srinivas @ 2018-06-13 20:07 UTC (permalink / raw)
  To: daniel.lezcano, viresh.kumar, rjw
  Cc: linux-kernel, peterz, Zhang, Rui, edubezval, linux-pm,
	andrea.parri, kevin.wangtao, leo.yan, daniel.thompson,
	javi.merino, vincent.guittot

[-- Attachment #1: Type: text/plain, Size: 1734 bytes --]

On Wed, 2018-06-13 at 22:00 +0200, Daniel Lezcano wrote:
> On 13/06/2018 17:54, Pandruvada, Srinivas wrote:
> > On Tue, 2018-06-12 at 14:00 +0200, Daniel Lezcano wrote:
> > > Initially, the cpu_cooling device for ARM was changed by adding a
> > > new
> > > policy inserting idle cycles. The intel_powerclamp driver does a
> > > similar action.
> > > 
> > > Instead of implementing idle injections privately in the
> > > cpu_cooling
> > > device, move the idle injection code in a dedicated framework and
> > > give
> > > the opportunity to other frameworks to make use of it.
> > > 
> > > The framework relies on the smpboot kthreads which handles via
> > > its
> > > main loop the common code for hotplugging and [un]parking.
> > > 
> > > This code was previously tested with the cpu cooling device and
> > > went
> > > through several iterations. It results now in split code and API
> > > exported in the header file. It was tested with the cpu cooling
> > > device
> > > with success.
> > > 
> > 
> > May be I have missed, but I don't see any use of powercap sysfs.
> > 
> > We created powercap sys that user space is presented a common
> > interface
> > for power capping. The RAPL driver was also submitted as cooling
> > device
> > before, but suggestion was to create this powercap.
> > 
> > So if powercap interface is not enough then may be we should
> > enhance
> > that.
> 
> We are creating a framework for idle injection. This framework can
> then
> be used by the cpu_cooling device, the power_clamp and in addition a
> power capping for ARM (if it makes sense).
But in this case, why in drivers/powercap folder as this is another
framework?

Thanks,
Srinivas

> 
> 
> 

[-- Attachment #2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 3290 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH V6] powercap/drivers/idle_injection: Add an idle injection framework
  2018-06-13 20:07     ` Pandruvada, Srinivas
@ 2018-06-13 20:19       ` Daniel Lezcano
  2018-06-13 20:32         ` Pandruvada, Srinivas
  0 siblings, 1 reply; 27+ messages in thread
From: Daniel Lezcano @ 2018-06-13 20:19 UTC (permalink / raw)
  To: Pandruvada, Srinivas, viresh.kumar, rjw
  Cc: linux-kernel, peterz, Zhang, Rui, edubezval, linux-pm,
	andrea.parri, kevin.wangtao, leo.yan, daniel.thompson,
	javi.merino, vincent.guittot

On 13/06/2018 22:07, Pandruvada, Srinivas wrote:
> On Wed, 2018-06-13 at 22:00 +0200, Daniel Lezcano wrote:
>> On 13/06/2018 17:54, Pandruvada, Srinivas wrote:
>>> On Tue, 2018-06-12 at 14:00 +0200, Daniel Lezcano wrote:
>>>> Initially, the cpu_cooling device for ARM was changed by adding a
>>>> new
>>>> policy inserting idle cycles. The intel_powerclamp driver does a
>>>> similar action.
>>>>
>>>> Instead of implementing idle injections privately in the
>>>> cpu_cooling
>>>> device, move the idle injection code in a dedicated framework and
>>>> give
>>>> the opportunity to other frameworks to make use of it.
>>>>
>>>> The framework relies on the smpboot kthreads which handles via
>>>> its
>>>> main loop the common code for hotplugging and [un]parking.
>>>>
>>>> This code was previously tested with the cpu cooling device and
>>>> went
>>>> through several iterations. It results now in split code and API
>>>> exported in the header file. It was tested with the cpu cooling
>>>> device
>>>> with success.
>>>>
>>>
>>> May be I have missed, but I don't see any use of powercap sysfs.
>>>
>>> We created powercap sys that user space is presented a common
>>> interface
>>> for power capping. The RAPL driver was also submitted as cooling
>>> device
>>> before, but suggestion was to create this powercap.
>>>
>>> So if powercap interface is not enough then may be we should
>>> enhance
>>> that.
>>
>> We are creating a framework for idle injection. This framework can
>> then
>> be used by the cpu_cooling device, the power_clamp and in addition a
>> power capping for ARM (if it makes sense).
> But in this case, why in drivers/powercap folder as this is another
> framework?

I presented at OSPM 2nd the cpu idle cooling device. It is ARM specific
but has some similarities with the power clamp driver.

Some board come with power numbers defined in their DT and from there we
are able to compute a targeted power by injecting idle cycles.

As the idle injection allows to do also power capping, Rafael told me to
move that to a framework (may be better say a component with exported
services or library) and put it in drivers/powercap.

From there we can implement a power capping, a cpu idle cooling device
and power_clamp (if the maintainer is willing to) using the same API.



-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH V6] powercap/drivers/idle_injection: Add an idle injection framework
  2018-06-13 20:19       ` Daniel Lezcano
@ 2018-06-13 20:32         ` Pandruvada, Srinivas
  2018-06-13 20:35           ` Daniel Lezcano
  0 siblings, 1 reply; 27+ messages in thread
From: Pandruvada, Srinivas @ 2018-06-13 20:32 UTC (permalink / raw)
  To: daniel.lezcano, viresh.kumar, rjw
  Cc: linux-kernel, peterz, Zhang, Rui, edubezval, linux-pm,
	andrea.parri, kevin.wangtao, leo.yan, daniel.thompson,
	javi.merino, vincent.guittot

[-- Attachment #1: Type: text/plain, Size: 2642 bytes --]

On Wed, 2018-06-13 at 22:19 +0200, Daniel Lezcano wrote:
> On 13/06/2018 22:07, Pandruvada, Srinivas wrote:
> > On Wed, 2018-06-13 at 22:00 +0200, Daniel Lezcano wrote:
> > > On 13/06/2018 17:54, Pandruvada, Srinivas wrote:
> > > > On Tue, 2018-06-12 at 14:00 +0200, Daniel Lezcano wrote:
> > > > > Initially, the cpu_cooling device for ARM was changed by
> > > > > adding a
> > > > > new
> > > > > policy inserting idle cycles. The intel_powerclamp driver
> > > > > does a
> > > > > similar action.
> > > > > 
> > > > > Instead of implementing idle injections privately in the
> > > > > cpu_cooling
> > > > > device, move the idle injection code in a dedicated framework
> > > > > and
> > > > > give
> > > > > the opportunity to other frameworks to make use of it.
> > > > > 
> > > > > The framework relies on the smpboot kthreads which handles
> > > > > via
> > > > > its
> > > > > main loop the common code for hotplugging and [un]parking.
> > > > > 
> > > > > This code was previously tested with the cpu cooling device
> > > > > and
> > > > > went
> > > > > through several iterations. It results now in split code and
> > > > > API
> > > > > exported in the header file. It was tested with the cpu
> > > > > cooling
> > > > > device
> > > > > with success.
> > > > > 
> > > > 
> > > > May be I have missed, but I don't see any use of powercap
> > > > sysfs.
> > > > 
> > > > We created powercap sys that user space is presented a common
> > > > interface
> > > > for power capping. The RAPL driver was also submitted as
> > > > cooling
> > > > device
> > > > before, but suggestion was to create this powercap.
> > > > 
> > > > So if powercap interface is not enough then may be we should
> > > > enhance
> > > > that.
> > > 
> > > We are creating a framework for idle injection. This framework
> > > can
> > > then
> > > be used by the cpu_cooling device, the power_clamp and in
> > > addition a
> > > power capping for ARM (if it makes sense).
> > 
> > But in this case, why in drivers/powercap folder as this is another
> > framework?
> 
> I presented at OSPM 2nd the cpu idle cooling device. It is ARM
> specific
> but has some similarities with the power clamp driver.
> 
> Some board come with power numbers defined in their DT and from there
> we
> are able to compute a targeted power by injecting idle cycles.
> 
> As the idle injection allows to do also power capping, Rafael told me
> to
> move that to a framework (may be better say a component with exported
> services or library) and put it in drivers/powercap.
So Rafael wants it to be in powercap.

[-- Attachment #2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 3290 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH V6] powercap/drivers/idle_injection: Add an idle injection framework
  2018-06-13 20:32         ` Pandruvada, Srinivas
@ 2018-06-13 20:35           ` Daniel Lezcano
  2018-06-13 21:27             ` Rafael J. Wysocki
  0 siblings, 1 reply; 27+ messages in thread
From: Daniel Lezcano @ 2018-06-13 20:35 UTC (permalink / raw)
  To: Pandruvada, Srinivas, viresh.kumar, rjw
  Cc: linux-kernel, peterz, Zhang, Rui, edubezval, linux-pm,
	andrea.parri, kevin.wangtao, leo.yan, daniel.thompson,
	javi.merino, vincent.guittot

On 13/06/2018 22:32, Pandruvada, Srinivas wrote:
> On Wed, 2018-06-13 at 22:19 +0200, Daniel Lezcano wrote:
>> On 13/06/2018 22:07, Pandruvada, Srinivas wrote:
>>> On Wed, 2018-06-13 at 22:00 +0200, Daniel Lezcano wrote:
>>>> On 13/06/2018 17:54, Pandruvada, Srinivas wrote:
>>>>> On Tue, 2018-06-12 at 14:00 +0200, Daniel Lezcano wrote:
>>>>>> Initially, the cpu_cooling device for ARM was changed by
>>>>>> adding a
>>>>>> new
>>>>>> policy inserting idle cycles. The intel_powerclamp driver
>>>>>> does a
>>>>>> similar action.
>>>>>>
>>>>>> Instead of implementing idle injections privately in the
>>>>>> cpu_cooling
>>>>>> device, move the idle injection code in a dedicated framework
>>>>>> and
>>>>>> give
>>>>>> the opportunity to other frameworks to make use of it.
>>>>>>
>>>>>> The framework relies on the smpboot kthreads which handles
>>>>>> via
>>>>>> its
>>>>>> main loop the common code for hotplugging and [un]parking.
>>>>>>
>>>>>> This code was previously tested with the cpu cooling device
>>>>>> and
>>>>>> went
>>>>>> through several iterations. It results now in split code and
>>>>>> API
>>>>>> exported in the header file. It was tested with the cpu
>>>>>> cooling
>>>>>> device
>>>>>> with success.
>>>>>>
>>>>>
>>>>> May be I have missed, but I don't see any use of powercap
>>>>> sysfs.
>>>>>
>>>>> We created powercap sys that user space is presented a common
>>>>> interface
>>>>> for power capping. The RAPL driver was also submitted as
>>>>> cooling
>>>>> device
>>>>> before, but suggestion was to create this powercap.
>>>>>
>>>>> So if powercap interface is not enough then may be we should
>>>>> enhance
>>>>> that.
>>>>
>>>> We are creating a framework for idle injection. This framework
>>>> can
>>>> then
>>>> be used by the cpu_cooling device, the power_clamp and in
>>>> addition a
>>>> power capping for ARM (if it makes sense).
>>>
>>> But in this case, why in drivers/powercap folder as this is another
>>> framework?
>>
>> I presented at OSPM 2nd the cpu idle cooling device. It is ARM
>> specific
>> but has some similarities with the power clamp driver.
>>
>> Some board come with power numbers defined in their DT and from there
>> we
>> are able to compute a targeted power by injecting idle cycles.
>>
>> As the idle injection allows to do also power capping, Rafael told me
>> to
>> move that to a framework (may be better say a component with exported
>> services or library) and put it in drivers/powercap.
> So Rafael wants it to be in powercap.

That is what I understood.




-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH V6] powercap/drivers/idle_injection: Add an idle injection framework
  2018-06-13 20:35           ` Daniel Lezcano
@ 2018-06-13 21:27             ` Rafael J. Wysocki
  0 siblings, 0 replies; 27+ messages in thread
From: Rafael J. Wysocki @ 2018-06-13 21:27 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: Pandruvada, Srinivas, viresh.kumar, rjw, linux-kernel, peterz,
	Zhang, Rui, edubezval, linux-pm, andrea.parri, kevin.wangtao,
	leo.yan, daniel.thompson, javi.merino, vincent.guittot

On Wed, Jun 13, 2018 at 10:35 PM, Daniel Lezcano
<daniel.lezcano@linaro.org> wrote:
> On 13/06/2018 22:32, Pandruvada, Srinivas wrote:
>> On Wed, 2018-06-13 at 22:19 +0200, Daniel Lezcano wrote:
>>> On 13/06/2018 22:07, Pandruvada, Srinivas wrote:
>>>> On Wed, 2018-06-13 at 22:00 +0200, Daniel Lezcano wrote:
>>>>> On 13/06/2018 17:54, Pandruvada, Srinivas wrote:
>>>>>> On Tue, 2018-06-12 at 14:00 +0200, Daniel Lezcano wrote:
>>>>>>> Initially, the cpu_cooling device for ARM was changed by
>>>>>>> adding a
>>>>>>> new
>>>>>>> policy inserting idle cycles. The intel_powerclamp driver
>>>>>>> does a
>>>>>>> similar action.
>>>>>>>
>>>>>>> Instead of implementing idle injections privately in the
>>>>>>> cpu_cooling
>>>>>>> device, move the idle injection code in a dedicated framework
>>>>>>> and
>>>>>>> give
>>>>>>> the opportunity to other frameworks to make use of it.
>>>>>>>
>>>>>>> The framework relies on the smpboot kthreads which handles
>>>>>>> via
>>>>>>> its
>>>>>>> main loop the common code for hotplugging and [un]parking.
>>>>>>>
>>>>>>> This code was previously tested with the cpu cooling device
>>>>>>> and
>>>>>>> went
>>>>>>> through several iterations. It results now in split code and
>>>>>>> API
>>>>>>> exported in the header file. It was tested with the cpu
>>>>>>> cooling
>>>>>>> device
>>>>>>> with success.
>>>>>>>
>>>>>>
>>>>>> May be I have missed, but I don't see any use of powercap
>>>>>> sysfs.
>>>>>>
>>>>>> We created powercap sys that user space is presented a common
>>>>>> interface
>>>>>> for power capping. The RAPL driver was also submitted as
>>>>>> cooling
>>>>>> device
>>>>>> before, but suggestion was to create this powercap.
>>>>>>
>>>>>> So if powercap interface is not enough then may be we should
>>>>>> enhance
>>>>>> that.
>>>>>
>>>>> We are creating a framework for idle injection. This framework
>>>>> can
>>>>> then
>>>>> be used by the cpu_cooling device, the power_clamp and in
>>>>> addition a
>>>>> power capping for ARM (if it makes sense).
>>>>
>>>> But in this case, why in drivers/powercap folder as this is another
>>>> framework?
>>>
>>> I presented at OSPM 2nd the cpu idle cooling device. It is ARM
>>> specific
>>> but has some similarities with the power clamp driver.
>>>
>>> Some board come with power numbers defined in their DT and from there
>>> we
>>> are able to compute a targeted power by injecting idle cycles.
>>>
>>> As the idle injection allows to do also power capping, Rafael told me
>>> to
>>> move that to a framework (may be better say a component with exported
>>> services or library) and put it in drivers/powercap.
>> So Rafael wants it to be in powercap.
>
> That is what I understood.

While this is not exactly power capping, the purpose of this mechanism
is to prevent too much power from being drawn by the system, so IMO it
falls under the same broad category.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH V6] powercap/drivers/idle_injection: Add an idle injection framework
  2018-06-12 17:35                 ` Peter Zijlstra
  2018-06-13  8:55                   ` Viresh Kumar
@ 2018-06-15  8:13                   ` Daniel Lezcano
  1 sibling, 0 replies; 27+ messages in thread
From: Daniel Lezcano @ 2018-06-15  8:13 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: viresh.kumar, rjw, linux-kernel, linux-pm, Eduardo Valentin,
	Javi Merino, Leo Yan, Kevin Wangtao, Vincent Guittot, Rui Zhang,
	Daniel Thompson, Andrea Parri

On 12/06/2018 19:35, Peter Zijlstra wrote:

[ ... ]

>>>> Yes, drifting is not an issue if that happens. This scheme is simpler
>>>> and safer than setting the timer ahead before waking up the tasks with
>>>> the risk it expires before all the tasks ended their idle cycles.
>>>
>>> sloppy though..
>>
>> Ok, do you prefer to see the timer set in the wakeup function and thus
>> having a periodic tick for the idle injection ?
> 
> I think having a HRTIMER_RESTART handler that does hrtimer_forward() is
> the most sensible. You will end up having to deal with threads not being
> ready, but I think that's not a real problem.

Thanks for the restartable timer suggestion. I'm finishing polishing the
code and definitively the result is much more nicer.


-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2018-06-15  8:13 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-06-12 12:00 [PATCH V6] powercap/drivers/idle_injection: Add an idle injection framework Daniel Lezcano
2018-06-12 12:30 ` Peter Zijlstra
2018-06-12 12:44   ` Daniel Lezcano
2018-06-12 12:52     ` Peter Zijlstra
2018-06-12 13:02       ` Daniel Lezcano
2018-06-12 14:06         ` Peter Zijlstra
2018-06-12 14:37           ` Daniel Lezcano
2018-06-12 15:58             ` Peter Zijlstra
2018-06-12 17:02               ` Daniel Lezcano
2018-06-12 17:35                 ` Peter Zijlstra
2018-06-13  8:55                   ` Viresh Kumar
2018-06-13  9:03                     ` Daniel Lezcano
2018-06-13  9:10                       ` Viresh Kumar
2018-06-15  8:13                   ` Daniel Lezcano
2018-06-12 12:59 ` Peter Zijlstra
2018-06-13  8:24   ` Viresh Kumar
2018-06-12 14:23 ` Peter Zijlstra
2018-06-12 14:45   ` Daniel Lezcano
2018-06-12 14:26 ` Peter Zijlstra
2018-06-12 14:46   ` Daniel Lezcano
2018-06-13 15:54 ` Pandruvada, Srinivas
2018-06-13 20:00   ` Daniel Lezcano
2018-06-13 20:07     ` Pandruvada, Srinivas
2018-06-13 20:19       ` Daniel Lezcano
2018-06-13 20:32         ` Pandruvada, Srinivas
2018-06-13 20:35           ` Daniel Lezcano
2018-06-13 21:27             ` Rafael J. Wysocki

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).