All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 1/3] PM: Introduce DEVFREQ: generic DVFS framework with device-specific OPPs
@ 2011-05-27  4:53 MyungJoo Ham
  2011-05-27  4:53 ` [PATCH v3 2/3] PM / DEVFREQ: add example governors MyungJoo Ham
                   ` (7 more replies)
  0 siblings, 8 replies; 35+ messages in thread
From: MyungJoo Ham @ 2011-05-27  4:53 UTC (permalink / raw)
  To: linux-pm
  Cc: linux-kernel, Len Brown, Pavel Machek, Rafael J. Wysocki,
	Greg Kroah-Hartman, Kyungmin Park, Jiejing Zhang, Colin Cross,
	Nishanth Menon, Thomas Gleixner, myungjoo.ham

With OPPs, a device may have multiple operable frequency and voltage
sets. However, there can be multiple possible operable sets and a system
will need to choose one from them. In order to reduce the power
consumption (by reducing frequency and voltage) without affecting the
performance too much, a Dynamic Voltage and Frequency Scaling (DVFS)
scheme may be used.

This patch introduces the DVFS capability to non-CPU devices with OPPs.
DVFS is a techique whereby the frequency and supplied voltage of a
device is adjusted on-the-fly. DVFS usually sets the frequency as low
as possible with given conditions (such as QoS assurance) and adjusts
voltage according to the chosen frequency in order to reduce power
consumption and heat dissipation.

The generic DVFS for devices, DEVFREQ, may appear quite similar with
/drivers/cpufreq.  However, CPUFREQ does not allow to have multiple
devices registered and is not suitable to have multiple heterogenous
devices with different (but simple) governors.

Normally, DVFS mechanism controls frequency based on the demand for
the device, and then, chooses voltage based on the chosen frequency.
DEVFREQ also controls the frequency based on the governor's frequency
recommendation and let OPP pick up the pair of frequency and voltage
based on the recommended frequency. Then, the chosen OPP is passed to
device driver's "target" callback.

Tested with Exynos4-NURI board.

Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>

--
Thank you for your valuable comments, Rafael, Greg, Pavel, and Colin.

Changed from v2
- Code style revised and cleaned up.
- Remove DEVFREQ entries that incur errors except for EAGAIN
- Bug fixed: tickle for devices without polling governors

Changes from v1(RFC)
- Rename: DVFS --> DEVFREQ
- Revised governor design
    . Governor receives the whole struct devfreq
    . Governor should gather usage information (thru get_dev_status)
itself
- Periodic monitoring runs only when needed.
- DEVFREQ no more deals with voltage information directly
- Removed some printks.
- Some cosmetics update
- Use freezable_wq.
---
 drivers/base/power/Makefile  |    1 +
 drivers/base/power/devfreq.c |  397 ++++++++++++++++++++++++++++++++++++++++++
 drivers/base/power/opp.c     |    9 +
 include/linux/devfreq.h      |  108 ++++++++++++
 kernel/power/Kconfig         |   34 ++++
 5 files changed, 549 insertions(+), 0 deletions(-)
 create mode 100644 drivers/base/power/devfreq.c
 create mode 100644 include/linux/devfreq.h

diff --git a/drivers/base/power/Makefile b/drivers/base/power/Makefile
index 3647e11..20118dc 100644
--- a/drivers/base/power/Makefile
+++ b/drivers/base/power/Makefile
@@ -4,5 +4,6 @@ obj-$(CONFIG_PM_RUNTIME)	+= runtime.o
 obj-$(CONFIG_PM_TRACE_RTC)	+= trace.o
 obj-$(CONFIG_PM_OPP)	+= opp.o
 obj-$(CONFIG_HAVE_CLK)	+= clock_ops.o
+obj-$(CONFIG_PM_DEVFREQ)	+= devfreq.o
 
 ccflags-$(CONFIG_DEBUG_DRIVER) := -DDEBUG
\ No newline at end of file
diff --git a/drivers/base/power/devfreq.c b/drivers/base/power/devfreq.c
new file mode 100644
index 0000000..883d953
--- /dev/null
+++ b/drivers/base/power/devfreq.c
@@ -0,0 +1,397 @@
+/*
+ * DEVFREQ: Generic Dynamic Voltage and Frequency Scaling (DVFS) Framework
+ *	    for Non-CPU Devices Based on OPP.
+ *
+ * Copyright (C) 2011 Samsung Electronics
+ *	MyungJoo Ham <myungjoo.ham@samsung.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/kernel.h>
+#include <linux/errno.h>
+#include <linux/err.h>
+#include <linux/init.h>
+#include <linux/slab.h>
+#include <linux/opp.h>
+#include <linux/devfreq.h>
+#include <linux/workqueue.h>
+#include <linux/platform_device.h>
+#include <linux/list.h>
+#include <linux/printk.h>
+
+/*
+ * DEVFREQ Monitoring Interval in ms.
+ * It is recommended to be "jiffy_in_ms" * n, where n is an integer >= 1.
+ */
+#define DEVFREQ_INTERVAL	20
+
+/*
+ * devfreq_work periodically (given by DEVFREQ_INTERVAL) monitors every
+ * registered device.
+ */
+static bool polling;
+static struct workqueue_struct *devfreq_wq;
+static struct delayed_work devfreq_work;
+/* The list of all device-devfreq */
+static LIST_HEAD(devfreq_list);
+/* Exclusive access to devfreq_list and its elements */
+static DEFINE_MUTEX(devfreq_list_lock);
+
+/**
+ * find_device_devfreq() - find devfreq struct using device pointer
+ * @dev:	device pointer used to lookup device DEVFREQ.
+ *
+ * Search the list of device DEVFREQs and return the matched device's
+ * DEVFREQ info. devfreq_list_lock should be held by the caller.
+ */
+static struct devfreq *find_device_devfreq(struct device *dev)
+{
+	struct devfreq *tmp_devfreq;
+
+	if (unlikely(IS_ERR_OR_NULL(dev))) {
+		pr_err("%s: Invalid parameters\n", __func__);
+		return ERR_PTR(-EINVAL);
+	}
+
+	list_for_each_entry(tmp_devfreq, &devfreq_list, node) {
+		if (tmp_devfreq->dev == dev)
+			return tmp_devfreq;
+	}
+
+	return ERR_PTR(-ENODEV);
+}
+
+/**
+ * devfreq_do() - Check the usage profile of a given device and configure
+ *		frequency and voltage accordingly
+ * @devfreq:	DEVFREQ info of the given device
+ */
+static int devfreq_do(struct devfreq *devfreq)
+{
+	struct opp *opp;
+	unsigned long freq;
+	int err;
+
+	err = devfreq->governor->get_target_freq(devfreq, &freq);
+	if (err)
+		return err;
+
+	opp = opp_find_freq_ceil(devfreq->dev, &freq);
+	if (opp == ERR_PTR(-ENODEV))
+		opp = opp_find_freq_floor(devfreq->dev, &freq);
+
+	if (IS_ERR(opp))
+		return PTR_ERR(opp);
+
+	if (devfreq->previous_freq == freq)
+		return 0;
+
+	err = devfreq->profile->target(devfreq->dev, opp);
+	if (err)
+		return err;
+
+	devfreq->previous_freq = freq;
+	return 0;
+}
+
+/**
+ * devfreq_monitor() - Regularly run devfreq_do() and support device DEVFREQ tickle.
+ * @work: the work struct used to run devfreq_monitor periodically.
+ */
+static void devfreq_monitor(struct work_struct *work)
+{
+	struct devfreq *devfreq;
+	int error;
+	bool continue_polling = false;
+	struct devfreq *to_remove = NULL;
+
+	mutex_lock(&devfreq_list_lock);
+
+	list_for_each_entry(devfreq, &devfreq_list, node) {
+		/* Remove the devfreq entry that failed */
+		if (to_remove) {
+			list_del(&to_remove->node);
+			kfree(to_remove);
+			to_remove = NULL;
+		}
+
+		/*
+		 * If the device is tickled and the tickle duration is left,
+		 * do not change the frequency for a while
+		 */
+		if (devfreq->tickle) {
+			continue_polling = true;
+			devfreq->tickle--;
+
+			/*
+			 * If the tickle is ending and the device is not going
+			 * to poll, force the device to poll next time so that
+			 * it can return to the original frequency afterwards.
+			 * However, non-polling device will have 0 polling_ms,
+			 * it will not poll again later.
+			 */
+			if (devfreq->tickle == 0 && devfreq->next_polling == 0)
+				devfreq->next_polling = 1;
+
+			continue;
+		}
+
+		/* This device does not require polling */
+		if (devfreq->next_polling == 0)
+			continue;
+
+		continue_polling = true;
+
+		if (devfreq->next_polling == 1) {
+			/* This device is polling this time */
+			error = devfreq_do(devfreq);
+			if (error && error != -EAGAIN) {
+				/*
+				 * Remove a devfreq with error. However,
+				 * We cannot remove it right here because the
+				 * devfreq pointer is going to be used by
+				 * list_for_each_entry above. Thus, it is
+				 * removed afterwards.
+				 */
+				to_remove = devfreq->dev;
+				dev_err(devfreq->dev, "devfreq_do error(%d). "
+					"DEVFREQ is removed from the device\n",
+					error);
+				continue;
+			}
+			devfreq->next_polling = DIV_ROUND_UP(
+						devfreq->profile->polling_ms,
+						DEVFREQ_INTERVAL);
+		} else {
+			/* The device will poll later when next_polling = 1 */
+			devfreq->next_polling--;
+		}
+	}
+
+	if (to_remove) {
+		list_del(&to_remove->node);
+		kfree(to_remove);
+		to_remove = NULL;
+	}
+
+	if (continue_polling) {
+		polling = true;
+		queue_delayed_work(devfreq_wq, &devfreq_work,
+				   msecs_to_jiffies(DEVFREQ_INTERVAL));
+	} else {
+		polling = false;
+	}
+
+	mutex_unlock(&devfreq_list_lock);
+}
+
+/**
+ * devfreq_add_device() - Add devfreq feature to the device
+ * @dev:	the device to add devfreq feature.
+ * @profile:	device-specific profile to run devfreq.
+ * @governor:	the policy to choose frequency.
+ */
+int devfreq_add_device(struct device *dev, struct devfreq_dev_profile *profile,
+		       struct devfreq_governor *governor)
+{
+	struct devfreq *new_devfreq, *devfreq;
+	int err = 0;
+
+	if (!dev || !profile || !governor) {
+		dev_err(dev, "%s: Invalid parameters.\n", __func__);
+		return -EINVAL;
+	}
+
+	mutex_lock(&devfreq_list_lock);
+
+	devfreq = find_device_devfreq(dev);
+	if (!IS_ERR(devfreq)) {
+		dev_err(dev, "%s: Unable to create DEVFREQ for the device. "
+			"It already has one.\n", __func__);
+		err = -EINVAL;
+		goto out;
+	}
+
+	new_devfreq = kzalloc(sizeof(struct devfreq), GFP_KERNEL);
+	if (!new_devfreq) {
+		dev_err(dev, "%s: Unable to create DEVFREQ for the device\n",
+			__func__);
+		err = -ENOMEM;
+		goto out;
+	}
+
+	new_devfreq->dev = dev;
+	new_devfreq->profile = profile;
+	new_devfreq->governor = governor;
+	new_devfreq->next_polling = DIV_ROUND_UP(profile->polling_ms,
+						 DEVFREQ_INTERVAL);
+	new_devfreq->previous_freq = profile->initial_freq;
+
+	list_add(&new_devfreq->node, &devfreq_list);
+
+	if (devfreq_wq && new_devfreq->next_polling && !polling) {
+		polling = true;
+		queue_delayed_work(devfreq_wq, &devfreq_work,
+				   msecs_to_jiffies(DEVFREQ_INTERVAL));
+	}
+out:
+	mutex_unlock(&devfreq_list_lock);
+
+	return err;
+}
+
+/**
+ * devfreq_remove_device() - Remove DEVFREQ feature from a device.
+ * @device:	the device to remove devfreq feature.
+ */
+int devfreq_remove_device(struct device *dev)
+{
+	struct devfreq *devfreq;
+
+	if (!dev)
+		return -EINVAL;
+
+	mutex_lock(&devfreq_list_lock);
+	devfreq = find_device_devfreq(dev);
+	if (IS_ERR(devfreq)) {
+		dev_err(dev, "%s: Unable to find DEVFREQ entry for the device.\n",
+			__func__);
+		mutex_unlock(&devfreq_list_lock);
+		return -EINVAL;
+	}
+
+	list_del(&devfreq->node);
+
+	kfree(devfreq);
+
+	mutex_unlock(&devfreq_list_lock);
+
+	return 0;
+}
+
+/**
+ * devfreq_update() - Notify that the device OPP has been changed.
+ * @dev:	the device whose OPP has been changed.
+ * @may_not_exist:	do not print error message even if the device
+ *			does not have devfreq entry.
+ */
+int devfreq_update(struct device *dev)
+{
+	struct devfreq *devfreq;
+	int err = 0;
+
+	mutex_lock(&devfreq_list_lock);
+
+	devfreq = find_device_devfreq(dev);
+	if (IS_ERR(devfreq)) {
+		err = PTR_ERR(devfreq);
+		goto out;
+	}
+
+	if (devfreq->tickle) {
+		/* If the max freq available is changed, re-tickle */
+		unsigned long freq = devfreq->profile->max_freq;
+		struct opp *opp = opp_find_freq_floor(devfreq->dev, &freq);
+
+		if (IS_ERR(opp)) {
+			err = PTR_ERR(opp);
+			goto out;
+		}
+
+		/* Max freq available is not changed */
+		if (devfreq->previous_freq == freq)
+			goto out;
+
+		err = devfreq->profile->target(devfreq->dev, opp);
+		if (!err)
+			devfreq->previous_freq = freq;
+	} else {
+		/* Reevaluate the proper frequency */
+		err = devfreq_do(devfreq);
+	}
+
+out:
+	mutex_unlock(&devfreq_list_lock);
+	return err;
+}
+
+static int _devfreq_tickle_device(struct devfreq *df, unsigned long delay)
+{
+	int err = 0;
+	unsigned long freq;
+	struct opp *opp;
+
+	freq = df->profile->max_freq;
+	opp = opp_find_freq_floor(df->dev, &freq);
+	if (IS_ERR(opp))
+		return PTR_ERR(opp);
+
+	if (df->previous_freq != freq) {
+		err = df->profile->target(df->dev, opp);
+		if (!err)
+			df->previous_freq = freq;
+	}
+	if (err) {
+		dev_err(df->dev, "%s: Cannot set frequency.\n", __func__);
+	} else {
+		df->tickle = delay;
+		df->num_tickle++;
+	}
+
+	if (devfreq_wq && !polling) {
+		polling = true;
+		queue_delayed_work(devfreq_wq, &devfreq_work,
+				msecs_to_jiffies(DEVFREQ_INTERVAL));
+	}
+
+	return err;
+}
+
+/**
+ * devfreq_tickle_device() - Guarantee maximum operation speed for a while
+ *			instaneously.
+ * @dev:	the device to be tickled.
+ * @duration_ms:	the duration of tickle effect.
+ *
+ * Tickle sets the device at the maximum frequency instaneously and
+ * the maximum frequency is guaranteed to be used for the given duration.
+ * For faster user reponse time, an input event may tickle a related device
+ * so that the input event does not need to wait for the DEVFREQ to react with
+ * normal interval.
+ */
+int devfreq_tickle_device(struct device *dev, unsigned long duration_ms)
+{
+	struct devfreq *devfreq;
+	int err = 0;
+	unsigned long delay; /* in # of DEVFREQ_INTERVAL */
+
+	mutex_lock(&devfreq_list_lock);
+	devfreq = find_device_devfreq(dev);
+	delay = DIV_ROUND_UP(duration_ms, DEVFREQ_INTERVAL);
+
+	if (IS_ERR(devfreq))
+		err = PTR_ERR(devfreq);
+	else
+		err = _devfreq_tickle_device(devfreq, delay);
+
+	mutex_unlock(&devfreq_list_lock);
+
+	return err;
+}
+
+static int __init devfreq_init(void)
+{
+	mutex_lock(&devfreq_list_lock);
+
+	polling = false;
+	devfreq_wq = create_freezable_workqueue("devfreq_wq");
+	INIT_DELAYED_WORK_DEFERRABLE(&devfreq_work, devfreq_monitor);
+	mutex_unlock(&devfreq_list_lock);
+
+	devfreq_monitor(&devfreq_work.work);
+	return 0;
+}
+late_initcall(devfreq_init);
diff --git a/drivers/base/power/opp.c b/drivers/base/power/opp.c
index 56a6899..819c1b3 100644
--- a/drivers/base/power/opp.c
+++ b/drivers/base/power/opp.c
@@ -21,6 +21,7 @@
 #include <linux/rculist.h>
 #include <linux/rcupdate.h>
 #include <linux/opp.h>
+#include <linux/devfreq.h>
 
 /*
  * Internal data structure organization with the OPP layer library is as
@@ -428,6 +429,11 @@ int opp_add(struct device *dev, unsigned long freq, unsigned long u_volt)
 	list_add_rcu(&new_opp->node, head);
 	mutex_unlock(&dev_opp_list_lock);
 
+	/*
+	 * Notify generic dvfs for the change and ignore error
+	 * because the device may not have a devfreq entry
+	 */
+	devfreq_update(dev);
 	return 0;
 }
 
@@ -512,6 +518,9 @@ unlock:
 	mutex_unlock(&dev_opp_list_lock);
 out:
 	kfree(new_opp);
+
+	/* Notify generic dvfs for the change and ignore error */
+	devfreq_update(dev);
 	return r;
 }
 
diff --git a/include/linux/devfreq.h b/include/linux/devfreq.h
new file mode 100644
index 0000000..59c8bda
--- /dev/null
+++ b/include/linux/devfreq.h
@@ -0,0 +1,108 @@
+/*
+ * DEVFREQ: Generic Dynamic Voltage and Frequency Scaling (DVFS) Framework
+ *	    for Non-CPU Devices Based on OPP.
+ *
+ * Copyright (C) 2011 Samsung Electronics
+ *	MyungJoo Ham <myungjoo.ham@samsung.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef __LINUX_DEVFREQ_H__
+#define __LINUX_DEVFREQ_H__
+
+struct devfreq;
+struct devfreq_dev_status {
+	/* both since the last measure */
+	unsigned long total_time;
+	unsigned long busy_time;
+	unsigned long current_frequency;
+};
+
+struct devfreq_dev_profile {
+	unsigned long max_freq; /* may be larger than the actual value */
+	unsigned long initial_freq;
+	int polling_ms;	/* 0 for at opp change only */
+
+	int (*target)(struct device *dev, struct opp *opp);
+	int (*get_dev_status)(struct device *dev,
+			      struct devfreq_dev_status *stat);
+};
+
+/**
+ * struct devfreq_governor - DEVFREQ Policy Governor
+ * @data	Governor's internal data. The framework does not care of it.
+ * @get_target_freq	Returns desired operating frequency for the device.
+ *			Basically, get_target_freq will run
+ *			devfreq_dev_profile.get_dev_status() to get the
+ *			status of the device (load = busy_time / total_time).
+ */
+struct devfreq_governor {
+	void *data; /* private data for get_target_freq */
+	int (*get_target_freq)(struct devfreq *this, unsigned long *freq);
+};
+
+/**
+ * struct devfreq - Device DEVFREQ structure
+ * @node	list node - contains the devices with DEVFREQ that have been
+ *		registered.
+ * @dev		device pointer
+ * @profile	device-specific devfreq profile
+ * @governor	method how to choose frequency based on the usage.
+ * @previous_freq	previously configured frequency value.
+ * @next_polling	the number of remaining "devfreq_monitor" executions to
+ *			reevaluate frequency/voltage of the device. Set by
+ *			profile's polling_ms interval.
+ * @tickle	positive if DEVFREQ-tickling is activated for the device.
+ *		at each executino of devfreq_monitor, tickle is decremented.
+ *		User may tickle a device-devfreq in order to set maximum
+ *		frequency instaneously with some guaranteed duration.
+ *
+ * This structure stores the DEVFREQ information for a give device.
+ */
+struct devfreq {
+	struct list_head node;
+
+	struct device *dev;
+	struct devfreq_dev_profile *profile;
+	struct devfreq_governor *governor;
+
+	unsigned long previous_freq;
+	unsigned int next_polling;
+	unsigned int tickle;
+};
+
+#if defined(CONFIG_PM_DEVFREQ)
+extern int devfreq_add_device(struct device *dev,
+			   struct devfreq_dev_profile *profile,
+			   struct devfreq_governor *governor);
+extern int devfreq_remove_device(struct device *dev);
+extern int devfreq_update(struct device *dev);
+extern int devfreq_tickle_device(struct device *dev, unsigned long duration_ms);
+#else /* !CONFIG_PM_DEVFREQ */
+static int devfreq_add_device(struct device *dev,
+			   struct devfreq_dev_profile *profile,
+			   struct devfreq_governor *governor)
+{
+	return 0;
+}
+
+static int devfreq_remove_device(struct device *dev)
+{
+	return 0;
+}
+
+static int devfreq_update(struct device *dev)
+{
+	return 0;
+}
+
+static int devfreq_tickle_device(struct device *dev, unsigned long duration_ms)
+{
+	return 0;
+}
+#endif /* CONFIG_PM_DEVFREQ */
+
+#endif /* __LINUX_DEVFREQ_H__ */
diff --git a/kernel/power/Kconfig b/kernel/power/Kconfig
index 87f4d24..b7e15c8 100644
--- a/kernel/power/Kconfig
+++ b/kernel/power/Kconfig
@@ -227,3 +227,37 @@ config PM_OPP
 config PM_RUNTIME_CLK
 	def_bool y
 	depends on PM_RUNTIME && HAVE_CLK
+
+config ARCH_HAS_DEVFREQ
+	bool
+	depends on ARCH_HAS_OPP
+	help
+	  Denotes that the architecture supports DEVFREQ. If the architecture
+	  supports multiple OPP entries per device and the frequency of the
+	  devices with OPPs may be altered dynamically, the architecture
+	  supports DEVFREQ.
+
+config PM_DEVFREQ
+	bool "Generic Dynamic Voltage and Frequency Scaling (DVFS) Framework"
+	depends on PM_OPP && ARCH_HAS_DEVFREQ
+	help
+	  With OPP support, a device may have a list of frequencies and
+	  voltages available. DEVFREQ, a generic DVFS framework can be
+	  registered for a device with OPP support in order to let the
+	  governor provided to DEVFREQ choose an operating frequency
+	  based on the OPP's list and the policy given with DEVFREQ.
+
+	  Each device may have its own governor and policy. DEVFREQ can
+	  reevaluate the device state periodically and/or based on the
+	  OPP list changes (each frequency/voltage pair in OPP may be
+	  disabled or enabled).
+
+	  Like some CPUs with CPUFREQ, a device may have multiple clocks.
+	  However, because the clock frequencies of a single device are
+	  determined by the single device's state, an instance of DEVFREQ
+	  is attached to a single device and returns a "representative"
+	  clock frequency from the OPP of the device, which is also attached
+	  to a device by 1-to-1. The device registering DEVFREQ takes the
+	  responsiblity to "interpret" the frequency listed in OPP and
+	  to set its every clock accordingly with the "target" callback
+	  given to DEVFREQ.
-- 
1.7.4.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v3 2/3] PM / DEVFREQ: add example governors
  2011-05-27  4:53 [PATCH v3 1/3] PM: Introduce DEVFREQ: generic DVFS framework with device-specific OPPs MyungJoo Ham
@ 2011-05-27  4:53 ` MyungJoo Ham
  2011-07-02 21:58   ` Rafael J. Wysocki
  2011-07-02 21:58   ` Rafael J. Wysocki
  2011-05-27  4:53 ` MyungJoo Ham
                   ` (6 subsequent siblings)
  7 siblings, 2 replies; 35+ messages in thread
From: MyungJoo Ham @ 2011-05-27  4:53 UTC (permalink / raw)
  To: linux-pm
  Cc: linux-kernel, Len Brown, Pavel Machek, Rafael J. Wysocki,
	Greg Kroah-Hartman, Kyungmin Park, Jiejing Zhang, Colin Cross,
	Nishanth Menon, Thomas Gleixner, myungjoo.ham

Three CPUFREQ-like governors are provided as examples.

powersave: use the lowest frequency possible. The user (device) should
set the polling_ms as 0 because polling is useless for this governor.

performance: use the highest freqeuncy possible. The user (device)
should set the polling_ms as 0 because polling is useless for this
governor.

simple_ondemand: simplified version of CPUFREQ's ONDEMAND governor.

When a user updates OPP entries (enable/disable/add), OPP framework
automatically notifies DEVFREQ to update operating frequency
accordingly. Thus, DEVFREQ users (device drivers) do not need to update
DEVFREQ manually with OPP entry updates or set polling_ms for powersave
, performance, or any other "static" governors.

Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
---
 drivers/base/power/devfreq.c |   69 ++++++++++++++++++++++++++++++++++++++++++
 include/linux/devfreq.h      |    5 +++
 2 files changed, 74 insertions(+), 0 deletions(-)

diff --git a/drivers/base/power/devfreq.c b/drivers/base/power/devfreq.c
index 883d953..7648a94 100644
--- a/drivers/base/power/devfreq.c
+++ b/drivers/base/power/devfreq.c
@@ -395,3 +395,72 @@ static int __init devfreq_init(void)
 	return 0;
 }
 late_initcall(devfreq_init);
+
+static int devfreq_powersave_func(struct devfreq *df,
+				  unsigned long *freq)
+{
+	*freq = 0; /* devfreq_do will run "ceiling" to 0 */
+	return 0;
+}
+
+struct devfreq_governor devfreq_powersave = {
+	.get_target_freq = devfreq_powersave_func,
+};
+
+static int devfreq_performance_func(struct devfreq *df,
+				    unsigned long *freq)
+{
+	*freq = UINT_MAX; /* devfreq_do will run "floor" */
+	return 0;
+}
+
+struct devfreq_governor devfreq_performance = {
+	.get_target_freq = devfreq_performance_func,
+};
+
+/* Constants for DevFreq-Simple-Ondemand (DFSO) */
+#define DFSO_UPTHRESHOLD	(90)
+#define DFSO_DOWNDIFFERENCTIAL	(5)
+static int devfreq_simple_ondemand_func(struct devfreq *df,
+					unsigned long *freq)
+{
+	struct devfreq_dev_status stat;
+	int err = df->profile->get_dev_status(df->dev, &stat);
+	unsigned long long a, b;
+
+	if (err)
+		return err;
+
+	/* Set MAX if it's busy enough */
+	if (stat.busy_time * 100 >
+	    stat.total_time * DFSO_UPTHRESHOLD) {
+		*freq = UINT_MAX;
+		return 0;
+	}
+
+	/* Set MAX if we do not know the initial frequency */
+	if (stat.current_frequency == 0) {
+		*freq = UINT_MAX;
+		return 0;
+	}
+
+	/* Keep the current frequency */
+	if (stat.busy_time * 100 >
+	    stat.total_time * (DFSO_UPTHRESHOLD - DFSO_DOWNDIFFERENCTIAL)) {
+		*freq = stat.current_frequency;
+		return 0;
+	}
+
+	/* Set the desired frequency based on the load */
+	a = (unsigned long long) stat.busy_time * stat.current_frequency;
+	b = div_u64(a, stat.total_time);
+	b *= 100;
+	b = div_u64(b, (DFSO_UPTHRESHOLD - DFSO_DOWNDIFFERENCTIAL / 2));
+	*freq = (unsigned long) b;
+
+	return 0;
+}
+
+struct devfreq_governor devfreq_simple_ondemand = {
+	.get_target_freq = devfreq_simple_ondemand_func,
+};
diff --git a/include/linux/devfreq.h b/include/linux/devfreq.h
index 59c8bda..1ec9a40 100644
--- a/include/linux/devfreq.h
+++ b/include/linux/devfreq.h
@@ -81,6 +81,11 @@ extern int devfreq_add_device(struct device *dev,
 extern int devfreq_remove_device(struct device *dev);
 extern int devfreq_update(struct device *dev);
 extern int devfreq_tickle_device(struct device *dev, unsigned long duration_ms);
+
+extern struct devfreq_governor devfreq_powersave;
+extern struct devfreq_governor devfreq_performance;
+extern struct devfreq_governor devfreq_simple_ondemand;
+
 #else /* !CONFIG_PM_DEVFREQ */
 static int devfreq_add_device(struct device *dev,
 			   struct devfreq_dev_profile *profile,
-- 
1.7.4.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v3 2/3] PM / DEVFREQ: add example governors
  2011-05-27  4:53 [PATCH v3 1/3] PM: Introduce DEVFREQ: generic DVFS framework with device-specific OPPs MyungJoo Ham
  2011-05-27  4:53 ` [PATCH v3 2/3] PM / DEVFREQ: add example governors MyungJoo Ham
@ 2011-05-27  4:53 ` MyungJoo Ham
  2011-05-27  4:53 ` [PATCH v3 3/3] PM / DEVFREQ: add sysfs interface (including user tickling) MyungJoo Ham
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 35+ messages in thread
From: MyungJoo Ham @ 2011-05-27  4:53 UTC (permalink / raw)
  To: linux-pm
  Cc: Len Brown, Greg Kroah-Hartman, linux-kernel, Kyungmin Park,
	Thomas Gleixner

Three CPUFREQ-like governors are provided as examples.

powersave: use the lowest frequency possible. The user (device) should
set the polling_ms as 0 because polling is useless for this governor.

performance: use the highest freqeuncy possible. The user (device)
should set the polling_ms as 0 because polling is useless for this
governor.

simple_ondemand: simplified version of CPUFREQ's ONDEMAND governor.

When a user updates OPP entries (enable/disable/add), OPP framework
automatically notifies DEVFREQ to update operating frequency
accordingly. Thus, DEVFREQ users (device drivers) do not need to update
DEVFREQ manually with OPP entry updates or set polling_ms for powersave
, performance, or any other "static" governors.

Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
---
 drivers/base/power/devfreq.c |   69 ++++++++++++++++++++++++++++++++++++++++++
 include/linux/devfreq.h      |    5 +++
 2 files changed, 74 insertions(+), 0 deletions(-)

diff --git a/drivers/base/power/devfreq.c b/drivers/base/power/devfreq.c
index 883d953..7648a94 100644
--- a/drivers/base/power/devfreq.c
+++ b/drivers/base/power/devfreq.c
@@ -395,3 +395,72 @@ static int __init devfreq_init(void)
 	return 0;
 }
 late_initcall(devfreq_init);
+
+static int devfreq_powersave_func(struct devfreq *df,
+				  unsigned long *freq)
+{
+	*freq = 0; /* devfreq_do will run "ceiling" to 0 */
+	return 0;
+}
+
+struct devfreq_governor devfreq_powersave = {
+	.get_target_freq = devfreq_powersave_func,
+};
+
+static int devfreq_performance_func(struct devfreq *df,
+				    unsigned long *freq)
+{
+	*freq = UINT_MAX; /* devfreq_do will run "floor" */
+	return 0;
+}
+
+struct devfreq_governor devfreq_performance = {
+	.get_target_freq = devfreq_performance_func,
+};
+
+/* Constants for DevFreq-Simple-Ondemand (DFSO) */
+#define DFSO_UPTHRESHOLD	(90)
+#define DFSO_DOWNDIFFERENCTIAL	(5)
+static int devfreq_simple_ondemand_func(struct devfreq *df,
+					unsigned long *freq)
+{
+	struct devfreq_dev_status stat;
+	int err = df->profile->get_dev_status(df->dev, &stat);
+	unsigned long long a, b;
+
+	if (err)
+		return err;
+
+	/* Set MAX if it's busy enough */
+	if (stat.busy_time * 100 >
+	    stat.total_time * DFSO_UPTHRESHOLD) {
+		*freq = UINT_MAX;
+		return 0;
+	}
+
+	/* Set MAX if we do not know the initial frequency */
+	if (stat.current_frequency == 0) {
+		*freq = UINT_MAX;
+		return 0;
+	}
+
+	/* Keep the current frequency */
+	if (stat.busy_time * 100 >
+	    stat.total_time * (DFSO_UPTHRESHOLD - DFSO_DOWNDIFFERENCTIAL)) {
+		*freq = stat.current_frequency;
+		return 0;
+	}
+
+	/* Set the desired frequency based on the load */
+	a = (unsigned long long) stat.busy_time * stat.current_frequency;
+	b = div_u64(a, stat.total_time);
+	b *= 100;
+	b = div_u64(b, (DFSO_UPTHRESHOLD - DFSO_DOWNDIFFERENCTIAL / 2));
+	*freq = (unsigned long) b;
+
+	return 0;
+}
+
+struct devfreq_governor devfreq_simple_ondemand = {
+	.get_target_freq = devfreq_simple_ondemand_func,
+};
diff --git a/include/linux/devfreq.h b/include/linux/devfreq.h
index 59c8bda..1ec9a40 100644
--- a/include/linux/devfreq.h
+++ b/include/linux/devfreq.h
@@ -81,6 +81,11 @@ extern int devfreq_add_device(struct device *dev,
 extern int devfreq_remove_device(struct device *dev);
 extern int devfreq_update(struct device *dev);
 extern int devfreq_tickle_device(struct device *dev, unsigned long duration_ms);
+
+extern struct devfreq_governor devfreq_powersave;
+extern struct devfreq_governor devfreq_performance;
+extern struct devfreq_governor devfreq_simple_ondemand;
+
 #else /* !CONFIG_PM_DEVFREQ */
 static int devfreq_add_device(struct device *dev,
 			   struct devfreq_dev_profile *profile,
-- 
1.7.4.1

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v3 3/3] PM / DEVFREQ: add sysfs interface (including user tickling)
  2011-05-27  4:53 [PATCH v3 1/3] PM: Introduce DEVFREQ: generic DVFS framework with device-specific OPPs MyungJoo Ham
  2011-05-27  4:53 ` [PATCH v3 2/3] PM / DEVFREQ: add example governors MyungJoo Ham
  2011-05-27  4:53 ` MyungJoo Ham
@ 2011-05-27  4:53 ` MyungJoo Ham
  2011-07-02 22:13   ` Rafael J. Wysocki
  2011-07-02 22:13   ` Rafael J. Wysocki
  2011-05-27  4:53 ` MyungJoo Ham
                   ` (4 subsequent siblings)
  7 siblings, 2 replies; 35+ messages in thread
From: MyungJoo Ham @ 2011-05-27  4:53 UTC (permalink / raw)
  To: linux-pm
  Cc: linux-kernel, Len Brown, Pavel Machek, Rafael J. Wysocki,
	Greg Kroah-Hartman, Kyungmin Park, Jiejing Zhang, Colin Cross,
	Nishanth Menon, Thomas Gleixner, myungjoo.ham

1. System-wide sysfs interface
- tickle_all	R: number of tickle_all execution
		W: tickle all devfreq devices
- min_interval	R: devfreq monitoring base interval in ms
- monitoring	R: shows whether devfreq monitoring is active or
  not.

2. Device specific sysfs interface
- tickle	R: number of tickle execution for the device
		W: tickle the device

Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>

--
Changed from v2
- add ABI entries for devfreq sysfs interface
---
 Documentation/ABI/testing/sysfs-devices-devfreq |   21 ++++
 Documentation/ABI/testing/sysfs-power           |   43 ++++++++
 drivers/base/power/devfreq.c                    |  133 ++++++++++++++++++++++-
 include/linux/devfreq.h                         |    3 +
 4 files changed, 199 insertions(+), 1 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-devices-devfreq

diff --git a/Documentation/ABI/testing/sysfs-devices-devfreq b/Documentation/ABI/testing/sysfs-devices-devfreq
new file mode 100644
index 0000000..7f35a64
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-devices-devfreq
@@ -0,0 +1,21 @@
+What:		/sys/devices/.../devfreq/
+Date:		May 2011
+Contact:	MyungJoo Ham <myungjoo.ham@samsung.com>
+Description:
+		The /sys/device/.../devfreq directory will contain files
+		that provide interfaces to DEVFREQ for a specific device.
+
+What:		/sys/devices/.../devfreq/tickle
+Date:		May 2011
+Contact:	MyungJoo Ham <myungjoo.ham@samsung.com>
+Description:
+		The /sys/devices/.../devfreq/tickle file allows user space
+		to force the corresponding device to operate at its maximum
+		operable frequency instaneously and temporarily. After a
+		designated duration has passed, the operating frequency returns
+		to normal. When a user reads the tickle entry, it returns
+		the number of tickle executions for the device. When a user
+		writes to the tickle entry with the tickle duration in ms,
+		the effect of device tickling is held for the designated
+		duration. Note that the duration is rounded-up by
+		the value DEVFREQ_INTERVAL defined in devfreq.c
diff --git a/Documentation/ABI/testing/sysfs-power b/Documentation/ABI/testing/sysfs-power
index b464d12..4d8434b 100644
--- a/Documentation/ABI/testing/sysfs-power
+++ b/Documentation/ABI/testing/sysfs-power
@@ -172,3 +172,46 @@ Description:
 
 		Reading from this file will display the current value, which is
 		set to 1 MB by default.
+
+What:		/sys/power/devfreq/
+Date:		May 2011
+Contact:	MyungJoo Ham <myungjoo.ham@samsung.com>
+Description:
+		The /sys/power/devfreq directory will contain files that will
+		provide a unified interface to the DEVFREQ, a generic DVFS
+		(dynamic voltage and frequency scaling) framework.
+
+What:		/sys/power/devfreq/tickle_all
+Date:		May 2011
+Contact:	MyungJoo Ham <myungjoo.ham@samsung.com>
+Description:
+		The /sys/power/devfreq/tickle_all file allows user space to
+		force every device with DEVFREQ to operate at the maximum
+		frequency of the device instaneously and temporarily. After
+		a designated delay has passed, the operating frequency returns
+		to normal. If a user reads the tickle_all entry, it returns
+		the number of tickle_all executions. When writing to the
+		tickle_all entry, the user should supply with the duration of
+		tickle in ms (the "designated delay" mentioned before). Then,
+		the effect of tickle_all will hold for the denoted duration.
+		Note that the duration is rounded by the monitoring period
+		defined by DEVFREQ_INTERVAL in /drivers/base/power/devfreq.c.
+
+What:		/sys/power/devfreq/min_interval
+Date:		May 2011
+Contact:	MyungJoo Ham <myungjoo.ham@samsung.com>
+Description:
+		The /sys/power/devfreq/min_interval file shows the monitoring
+		period defined by DEVFREQ_INTERVAL in
+		/drivers/base/power/devfreq.c. The duration of device tickling
+		is rounded-up by DEVFREQ_INTERVAL.
+
+What:		/sys/power/devfreq/monitoring
+Date:		May 2011
+Contact:	MyungJoo Ham <myungjoo.ham@samsung.com>
+Description:
+		The /sys/power/devfreq/monitoring file shows whether DEVFREQ
+		is periodically monitoring. Periodic monitoring is activated
+		if there is a device that wants periodic monitoring for DVFS or
+		there is a device that is tickled (and the tickling duration is
+		not yet expired).
diff --git a/drivers/base/power/devfreq.c b/drivers/base/power/devfreq.c
index 7648a94..709c138 100644
--- a/drivers/base/power/devfreq.c
+++ b/drivers/base/power/devfreq.c
@@ -40,6 +40,9 @@ static LIST_HEAD(devfreq_list);
 /* Exclusive access to devfreq_list and its elements */
 static DEFINE_MUTEX(devfreq_list_lock);
 
+static struct kobject *devfreq_kobj;
+static struct attribute_group dev_attr_group;
+
 /**
  * find_device_devfreq() - find devfreq struct using device pointer
  * @dev:	device pointer used to lookup device DEVFREQ.
@@ -237,6 +240,8 @@ int devfreq_add_device(struct device *dev, struct devfreq_dev_profile *profile,
 		queue_delayed_work(devfreq_wq, &devfreq_work,
 				   msecs_to_jiffies(DEVFREQ_INTERVAL));
 	}
+
+	sysfs_update_group(&dev->kobj, &dev_attr_group);
 out:
 	mutex_unlock(&devfreq_list_lock);
 
@@ -263,6 +268,8 @@ int devfreq_remove_device(struct device *dev)
 		return -EINVAL;
 	}
 
+	sysfs_remove_group(&dev->kobj, &dev_attr_group);
+
 	list_del(&devfreq->node);
 
 	kfree(devfreq);
@@ -344,7 +351,7 @@ static int _devfreq_tickle_device(struct devfreq *df, unsigned long delay)
 	if (devfreq_wq && !polling) {
 		polling = true;
 		queue_delayed_work(devfreq_wq, &devfreq_work,
-				msecs_to_jiffies(DEVFREQ_INTERVAL));
+				   msecs_to_jiffies(DEVFREQ_INTERVAL));
 	}
 
 	return err;
@@ -382,6 +389,116 @@ int devfreq_tickle_device(struct device *dev, unsigned long duration_ms)
 	return err;
 }
 
+static int num_tickle_all;
+static ssize_t tickle_all(struct device *dev, struct device_attribute *attr,
+			  const char *buf, size_t count)
+{
+	int duration = 0;
+	struct devfreq *tmp;
+	unsigned long delay;
+
+	sscanf(buf, "%d", &duration);
+	if (duration < DEVFREQ_INTERVAL)
+		duration = DEVFREQ_INTERVAL;
+
+	if (unlikely(IS_ERR_OR_NULL(dev))) {
+		pr_err("%s: Invalid parameters\n", __func__);
+		return -EINVAL;
+	}
+
+	delay = DIV_ROUND_UP(duration, DEVFREQ_INTERVAL);
+
+	mutex_lock(&devfreq_list_lock);
+	list_for_each_entry(tmp, &devfreq_list, node) {
+		_devfreq_tickle_device(tmp, delay);
+	}
+	mutex_unlock(&devfreq_list_lock);
+
+	num_tickle_all++;
+	return count;
+}
+
+static ssize_t show_num_tickle_all(struct device *dev,
+				   struct device_attribute *attr, char *buf)
+{
+	return sprintf(buf, "%d\n", num_tickle_all);
+}
+
+static ssize_t show_min_interval(struct device *dev,
+				 struct device_attribute *attr, char *buf)
+{
+	return sprintf(buf, "%d\n", DEVFREQ_INTERVAL);
+}
+
+static ssize_t show_monitoring(struct device *dev,
+			       struct device_attribute *attr, char *buf)
+{
+	return sprintf(buf, "%d\n", monitoring ? 1 : 0);
+}
+
+static DEVICE_ATTR(tickle_all, 0644, show_num_tickle_all, tickle_all);
+static DEVICE_ATTR(min_interval, 0444, show_min_interval, NULL);
+static DEVICE_ATTR(monitoring, 0444, show_monitoring, NULL);
+static struct attribute *devfreq_entries[] = {
+	&dev_attr_tickle_all.attr,
+	&dev_attr_min_interval.attr,
+	&dev_attr_monitoring.attr,
+	NULL,
+};
+static struct attribute_group devfreq_attr_group = {
+	.name	= NULL,
+	.attrs	= devfreq_entries,
+};
+
+static ssize_t tickle(struct device *dev, struct device_attribute *attr,
+		      const char *buf, size_t count)
+{
+	int duration;
+	struct devfreq *df;
+	unsigned long delay;
+
+	sscanf(buf, "%d", &duration);
+	if (duration < DEVFREQ_INTERVAL)
+		duration = DEVFREQ_INTERVAL;
+
+	if (unlikely(IS_ERR_OR_NULL(dev))) {
+		pr_err("%s: Invalid parameters\n", __func__);
+		return -EINVAL;
+	}
+
+	delay = DIV_ROUND_UP(duration, DEVFREQ_INTERVAL);
+
+	mutex_lock(&devfreq_list_lock);
+	df = find_device_devfreq(dev);
+	_devfreq_tickle_device(df, delay);
+	mutex_unlock(&devfreq_list_lock);
+
+	return count;
+}
+
+static ssize_t show_num_tickle(struct device *dev,
+			       struct device_attribute *attr, char *buf)
+{
+	struct devfreq *df;
+
+	df = find_device_devfreq(dev);
+
+	if (!IS_ERR(df))
+		return sprintf(buf, "%d\n", df->num_tickle);
+
+	return PTR_ERR(df);
+}
+
+static DEVICE_ATTR(tickle, 0644, show_num_tickle, tickle);
+static struct attribute *dev_entries[] = {
+	&dev_attr_tickle.attr,
+	NULL,
+};
+static struct attribute_group dev_attr_group = {
+	.name	= "devfreq",
+	.attrs	= dev_entries,
+};
+
 static int __init devfreq_init(void)
 {
 	mutex_lock(&devfreq_list_lock);
@@ -389,6 +506,20 @@ static int __init devfreq_init(void)
 	polling = false;
 	devfreq_wq = create_freezable_workqueue("devfreq_wq");
 	INIT_DELAYED_WORK_DEFERRABLE(&devfreq_work, devfreq_monitor);
+
+#ifdef CONFIG_PM
+	/* Create sysfs */
+	devfreq_kobj = kobject_create_and_add("devfreq", power_kobj);
+	if (!devfreq_kobj) {
+		pr_err("Unable to create DEVFREQ kobject.\n");
+		goto out;
+	}
+	if (sysfs_create_group(devfreq_kobj, &devfreq_attr_group)) {
+		pr_err("Unable to create DEVFREQ sysfs entries.\n");
+		goto out;
+	}
+#endif
+out:
 	mutex_unlock(&devfreq_list_lock);
 
 	devfreq_monitor(&devfreq_work.work);
diff --git a/include/linux/devfreq.h b/include/linux/devfreq.h
index 1ec9a40..69334e7 100644
--- a/include/linux/devfreq.h
+++ b/include/linux/devfreq.h
@@ -59,6 +59,7 @@ struct devfreq_governor {
  *		at each executino of devfreq_monitor, tickle is decremented.
  *		User may tickle a device-devfreq in order to set maximum
  *		frequency instaneously with some guaranteed duration.
+ * @num_tickle	number of tickle calls.
  *
  * This structure stores the DEVFREQ information for a give device.
  */
@@ -72,6 +73,8 @@ struct devfreq {
 	unsigned long previous_freq;
 	unsigned int next_polling;
 	unsigned int tickle;
+
+	unsigned int num_tickle;
 };
 
 #if defined(CONFIG_PM_DEVFREQ)
-- 
1.7.4.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v3 3/3] PM / DEVFREQ: add sysfs interface (including user tickling)
  2011-05-27  4:53 [PATCH v3 1/3] PM: Introduce DEVFREQ: generic DVFS framework with device-specific OPPs MyungJoo Ham
                   ` (2 preceding siblings ...)
  2011-05-27  4:53 ` [PATCH v3 3/3] PM / DEVFREQ: add sysfs interface (including user tickling) MyungJoo Ham
@ 2011-05-27  4:53 ` MyungJoo Ham
  2011-05-28  8:57 ` [PATCH v3 1/3] PM: Introduce DEVFREQ: generic DVFS framework with device-specific OPPs Nishanth Menon
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 35+ messages in thread
From: MyungJoo Ham @ 2011-05-27  4:53 UTC (permalink / raw)
  To: linux-pm
  Cc: Len Brown, Greg Kroah-Hartman, linux-kernel, Kyungmin Park,
	Thomas Gleixner

1. System-wide sysfs interface
- tickle_all	R: number of tickle_all execution
		W: tickle all devfreq devices
- min_interval	R: devfreq monitoring base interval in ms
- monitoring	R: shows whether devfreq monitoring is active or
  not.

2. Device specific sysfs interface
- tickle	R: number of tickle execution for the device
		W: tickle the device

Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>

--
Changed from v2
- add ABI entries for devfreq sysfs interface
---
 Documentation/ABI/testing/sysfs-devices-devfreq |   21 ++++
 Documentation/ABI/testing/sysfs-power           |   43 ++++++++
 drivers/base/power/devfreq.c                    |  133 ++++++++++++++++++++++-
 include/linux/devfreq.h                         |    3 +
 4 files changed, 199 insertions(+), 1 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-devices-devfreq

diff --git a/Documentation/ABI/testing/sysfs-devices-devfreq b/Documentation/ABI/testing/sysfs-devices-devfreq
new file mode 100644
index 0000000..7f35a64
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-devices-devfreq
@@ -0,0 +1,21 @@
+What:		/sys/devices/.../devfreq/
+Date:		May 2011
+Contact:	MyungJoo Ham <myungjoo.ham@samsung.com>
+Description:
+		The /sys/device/.../devfreq directory will contain files
+		that provide interfaces to DEVFREQ for a specific device.
+
+What:		/sys/devices/.../devfreq/tickle
+Date:		May 2011
+Contact:	MyungJoo Ham <myungjoo.ham@samsung.com>
+Description:
+		The /sys/devices/.../devfreq/tickle file allows user space
+		to force the corresponding device to operate at its maximum
+		operable frequency instaneously and temporarily. After a
+		designated duration has passed, the operating frequency returns
+		to normal. When a user reads the tickle entry, it returns
+		the number of tickle executions for the device. When a user
+		writes to the tickle entry with the tickle duration in ms,
+		the effect of device tickling is held for the designated
+		duration. Note that the duration is rounded-up by
+		the value DEVFREQ_INTERVAL defined in devfreq.c
diff --git a/Documentation/ABI/testing/sysfs-power b/Documentation/ABI/testing/sysfs-power
index b464d12..4d8434b 100644
--- a/Documentation/ABI/testing/sysfs-power
+++ b/Documentation/ABI/testing/sysfs-power
@@ -172,3 +172,46 @@ Description:
 
 		Reading from this file will display the current value, which is
 		set to 1 MB by default.
+
+What:		/sys/power/devfreq/
+Date:		May 2011
+Contact:	MyungJoo Ham <myungjoo.ham@samsung.com>
+Description:
+		The /sys/power/devfreq directory will contain files that will
+		provide a unified interface to the DEVFREQ, a generic DVFS
+		(dynamic voltage and frequency scaling) framework.
+
+What:		/sys/power/devfreq/tickle_all
+Date:		May 2011
+Contact:	MyungJoo Ham <myungjoo.ham@samsung.com>
+Description:
+		The /sys/power/devfreq/tickle_all file allows user space to
+		force every device with DEVFREQ to operate at the maximum
+		frequency of the device instaneously and temporarily. After
+		a designated delay has passed, the operating frequency returns
+		to normal. If a user reads the tickle_all entry, it returns
+		the number of tickle_all executions. When writing to the
+		tickle_all entry, the user should supply with the duration of
+		tickle in ms (the "designated delay" mentioned before). Then,
+		the effect of tickle_all will hold for the denoted duration.
+		Note that the duration is rounded by the monitoring period
+		defined by DEVFREQ_INTERVAL in /drivers/base/power/devfreq.c.
+
+What:		/sys/power/devfreq/min_interval
+Date:		May 2011
+Contact:	MyungJoo Ham <myungjoo.ham@samsung.com>
+Description:
+		The /sys/power/devfreq/min_interval file shows the monitoring
+		period defined by DEVFREQ_INTERVAL in
+		/drivers/base/power/devfreq.c. The duration of device tickling
+		is rounded-up by DEVFREQ_INTERVAL.
+
+What:		/sys/power/devfreq/monitoring
+Date:		May 2011
+Contact:	MyungJoo Ham <myungjoo.ham@samsung.com>
+Description:
+		The /sys/power/devfreq/monitoring file shows whether DEVFREQ
+		is periodically monitoring. Periodic monitoring is activated
+		if there is a device that wants periodic monitoring for DVFS or
+		there is a device that is tickled (and the tickling duration is
+		not yet expired).
diff --git a/drivers/base/power/devfreq.c b/drivers/base/power/devfreq.c
index 7648a94..709c138 100644
--- a/drivers/base/power/devfreq.c
+++ b/drivers/base/power/devfreq.c
@@ -40,6 +40,9 @@ static LIST_HEAD(devfreq_list);
 /* Exclusive access to devfreq_list and its elements */
 static DEFINE_MUTEX(devfreq_list_lock);
 
+static struct kobject *devfreq_kobj;
+static struct attribute_group dev_attr_group;
+
 /**
  * find_device_devfreq() - find devfreq struct using device pointer
  * @dev:	device pointer used to lookup device DEVFREQ.
@@ -237,6 +240,8 @@ int devfreq_add_device(struct device *dev, struct devfreq_dev_profile *profile,
 		queue_delayed_work(devfreq_wq, &devfreq_work,
 				   msecs_to_jiffies(DEVFREQ_INTERVAL));
 	}
+
+	sysfs_update_group(&dev->kobj, &dev_attr_group);
 out:
 	mutex_unlock(&devfreq_list_lock);
 
@@ -263,6 +268,8 @@ int devfreq_remove_device(struct device *dev)
 		return -EINVAL;
 	}
 
+	sysfs_remove_group(&dev->kobj, &dev_attr_group);
+
 	list_del(&devfreq->node);
 
 	kfree(devfreq);
@@ -344,7 +351,7 @@ static int _devfreq_tickle_device(struct devfreq *df, unsigned long delay)
 	if (devfreq_wq && !polling) {
 		polling = true;
 		queue_delayed_work(devfreq_wq, &devfreq_work,
-				msecs_to_jiffies(DEVFREQ_INTERVAL));
+				   msecs_to_jiffies(DEVFREQ_INTERVAL));
 	}
 
 	return err;
@@ -382,6 +389,116 @@ int devfreq_tickle_device(struct device *dev, unsigned long duration_ms)
 	return err;
 }
 
+static int num_tickle_all;
+static ssize_t tickle_all(struct device *dev, struct device_attribute *attr,
+			  const char *buf, size_t count)
+{
+	int duration = 0;
+	struct devfreq *tmp;
+	unsigned long delay;
+
+	sscanf(buf, "%d", &duration);
+	if (duration < DEVFREQ_INTERVAL)
+		duration = DEVFREQ_INTERVAL;
+
+	if (unlikely(IS_ERR_OR_NULL(dev))) {
+		pr_err("%s: Invalid parameters\n", __func__);
+		return -EINVAL;
+	}
+
+	delay = DIV_ROUND_UP(duration, DEVFREQ_INTERVAL);
+
+	mutex_lock(&devfreq_list_lock);
+	list_for_each_entry(tmp, &devfreq_list, node) {
+		_devfreq_tickle_device(tmp, delay);
+	}
+	mutex_unlock(&devfreq_list_lock);
+
+	num_tickle_all++;
+	return count;
+}
+
+static ssize_t show_num_tickle_all(struct device *dev,
+				   struct device_attribute *attr, char *buf)
+{
+	return sprintf(buf, "%d\n", num_tickle_all);
+}
+
+static ssize_t show_min_interval(struct device *dev,
+				 struct device_attribute *attr, char *buf)
+{
+	return sprintf(buf, "%d\n", DEVFREQ_INTERVAL);
+}
+
+static ssize_t show_monitoring(struct device *dev,
+			       struct device_attribute *attr, char *buf)
+{
+	return sprintf(buf, "%d\n", monitoring ? 1 : 0);
+}
+
+static DEVICE_ATTR(tickle_all, 0644, show_num_tickle_all, tickle_all);
+static DEVICE_ATTR(min_interval, 0444, show_min_interval, NULL);
+static DEVICE_ATTR(monitoring, 0444, show_monitoring, NULL);
+static struct attribute *devfreq_entries[] = {
+	&dev_attr_tickle_all.attr,
+	&dev_attr_min_interval.attr,
+	&dev_attr_monitoring.attr,
+	NULL,
+};
+static struct attribute_group devfreq_attr_group = {
+	.name	= NULL,
+	.attrs	= devfreq_entries,
+};
+
+static ssize_t tickle(struct device *dev, struct device_attribute *attr,
+		      const char *buf, size_t count)
+{
+	int duration;
+	struct devfreq *df;
+	unsigned long delay;
+
+	sscanf(buf, "%d", &duration);
+	if (duration < DEVFREQ_INTERVAL)
+		duration = DEVFREQ_INTERVAL;
+
+	if (unlikely(IS_ERR_OR_NULL(dev))) {
+		pr_err("%s: Invalid parameters\n", __func__);
+		return -EINVAL;
+	}
+
+	delay = DIV_ROUND_UP(duration, DEVFREQ_INTERVAL);
+
+	mutex_lock(&devfreq_list_lock);
+	df = find_device_devfreq(dev);
+	_devfreq_tickle_device(df, delay);
+	mutex_unlock(&devfreq_list_lock);
+
+	return count;
+}
+
+static ssize_t show_num_tickle(struct device *dev,
+			       struct device_attribute *attr, char *buf)
+{
+	struct devfreq *df;
+
+	df = find_device_devfreq(dev);
+
+	if (!IS_ERR(df))
+		return sprintf(buf, "%d\n", df->num_tickle);
+
+	return PTR_ERR(df);
+}
+
+static DEVICE_ATTR(tickle, 0644, show_num_tickle, tickle);
+static struct attribute *dev_entries[] = {
+	&dev_attr_tickle.attr,
+	NULL,
+};
+static struct attribute_group dev_attr_group = {
+	.name	= "devfreq",
+	.attrs	= dev_entries,
+};
+
 static int __init devfreq_init(void)
 {
 	mutex_lock(&devfreq_list_lock);
@@ -389,6 +506,20 @@ static int __init devfreq_init(void)
 	polling = false;
 	devfreq_wq = create_freezable_workqueue("devfreq_wq");
 	INIT_DELAYED_WORK_DEFERRABLE(&devfreq_work, devfreq_monitor);
+
+#ifdef CONFIG_PM
+	/* Create sysfs */
+	devfreq_kobj = kobject_create_and_add("devfreq", power_kobj);
+	if (!devfreq_kobj) {
+		pr_err("Unable to create DEVFREQ kobject.\n");
+		goto out;
+	}
+	if (sysfs_create_group(devfreq_kobj, &devfreq_attr_group)) {
+		pr_err("Unable to create DEVFREQ sysfs entries.\n");
+		goto out;
+	}
+#endif
+out:
 	mutex_unlock(&devfreq_list_lock);
 
 	devfreq_monitor(&devfreq_work.work);
diff --git a/include/linux/devfreq.h b/include/linux/devfreq.h
index 1ec9a40..69334e7 100644
--- a/include/linux/devfreq.h
+++ b/include/linux/devfreq.h
@@ -59,6 +59,7 @@ struct devfreq_governor {
  *		at each executino of devfreq_monitor, tickle is decremented.
  *		User may tickle a device-devfreq in order to set maximum
  *		frequency instaneously with some guaranteed duration.
+ * @num_tickle	number of tickle calls.
  *
  * This structure stores the DEVFREQ information for a give device.
  */
@@ -72,6 +73,8 @@ struct devfreq {
 	unsigned long previous_freq;
 	unsigned int next_polling;
 	unsigned int tickle;
+
+	unsigned int num_tickle;
 };
 
 #if defined(CONFIG_PM_DEVFREQ)
-- 
1.7.4.1

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 1/3] PM: Introduce DEVFREQ: generic DVFS framework with device-specific OPPs
  2011-05-27  4:53 [PATCH v3 1/3] PM: Introduce DEVFREQ: generic DVFS framework with device-specific OPPs MyungJoo Ham
                   ` (3 preceding siblings ...)
  2011-05-27  4:53 ` MyungJoo Ham
@ 2011-05-28  8:57 ` Nishanth Menon
  2011-05-30  5:03   ` MyungJoo Ham
  2011-05-30  5:03   ` MyungJoo Ham
  2011-05-28  8:57 ` Nishanth Menon
                   ` (2 subsequent siblings)
  7 siblings, 2 replies; 35+ messages in thread
From: Nishanth Menon @ 2011-05-28  8:57 UTC (permalink / raw)
  To: MyungJoo Ham
  Cc: linux-pm, linux-kernel, Len Brown, Pavel Machek,
	Rafael J. Wysocki, Greg Kroah-Hartman, Kyungmin Park,
	Jiejing Zhang, Colin Cross, Thomas Gleixner, myungjoo.ham

On 13:53-20110527, MyungJoo Ham wrote:
[..]
> diff --git a/drivers/base/power/opp.c b/drivers/base/power/opp.c
> index 56a6899..819c1b3 100644
> --- a/drivers/base/power/opp.c
> +++ b/drivers/base/power/opp.c
> @@ -21,6 +21,7 @@
>  #include <linux/rculist.h>
>  #include <linux/rcupdate.h>
>  #include <linux/opp.h>
> +#include <linux/devfreq.h>
>  
>  /*
>   * Internal data structure organization with the OPP layer library is as
> @@ -428,6 +429,11 @@ int opp_add(struct device *dev, unsigned long freq, unsigned long u_volt)
>  	list_add_rcu(&new_opp->node, head);
>  	mutex_unlock(&dev_opp_list_lock);
>  
> +	/*
> +	 * Notify generic dvfs for the change and ignore error
> +	 * because the device may not have a devfreq entry
> +	 */
> +	devfreq_update(dev);
I think I understand your thought about notifiers here - we had some
initial thought that we may need notifiers, for e.g. clk change
notifiers which could implement enable/disable on a dependent device,
thermal management etc.. so far, we have'nt had a need to modify the
lowest data store layer to handle this. In this case, I think you'd
like a notifier mechanism in general for opp_add,enable or disable in
the opp library. However, with this change, an opp_add now means:
a) opp_add now depends on devfreq_update() to be part of it's integral
operation - I think the devfreq should maintain it's own notifier
mechanisms and not make OPP library do the notification trigger job of
devfreq.
b) By ignoring the error, how will the caller of opp_add now know if the
failures were real - e.g. some device failed to act on the notification
and the overall opp_add operation failed? And why should OPP library do
recovery for on behalf of devfreq?
c) How will you ensure the lock accuracy - given that you need to keep
to the constraints of opp_add/opp_set_availability here? For example:
devfreq_do is called from devfreq_monitor(workqueue) and devfreq_update
in addition to having to call a bunch of function pointers whose
implementation you do not know, having a function that ensures security
of it's data, handles all lock conditions that is imposed differently by
add,enable and disable is not visible in this implementation.
c) How about considering the usage of OPP library with an SoC cpufreq
implementation say for MPU AND using the same SoC using devfreq for
controlling some devices like MALI in your example simultaneously? Lets
say the thermal policy manager implementation for some reason did an
opp_disable of a higher OPP of MPU - devfreq_update gets called with the
device(mpu_dev).. devfreq_update needs to now invoke it's validation
path and fails correctly - Vs devfreq_update being called by some module
which actually uses devfreq and the update failed - How does one
differentiate between the two?

just my 2 cents, I think these kind of notifiers should probably
stay out of opp library OR notifiers be introduced in a generic and safe
fashion.

>  	return 0;
>  }
>  
> @@ -512,6 +518,9 @@ unlock:
>  	mutex_unlock(&dev_opp_list_lock);
>  out:
>  	kfree(new_opp);
> +
> +	/* Notify generic dvfs for the change and ignore error */
> +	devfreq_update(dev);
Another place where I am confused - how does devfreq_update know in what
context is it being called is the OPP getting enabled/disabled/added -
devfreq_update has to do some additional work to make it's own decision
- it is almost as if it uses devfreq_update as a trigger mechanism to reevalute
the entire decision tree - without using information of the context
opp_enable/disable/add has to maybe make better call. if this
information is not needed, maybe some other mechanism implemented by
devfreq layer might suffice..

>  	return r;
>  }

[...]
-- 
Regards,
Nishanth Menon

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 1/3] PM: Introduce DEVFREQ: generic DVFS framework with device-specific OPPs
  2011-05-27  4:53 [PATCH v3 1/3] PM: Introduce DEVFREQ: generic DVFS framework with device-specific OPPs MyungJoo Ham
                   ` (4 preceding siblings ...)
  2011-05-28  8:57 ` [PATCH v3 1/3] PM: Introduce DEVFREQ: generic DVFS framework with device-specific OPPs Nishanth Menon
@ 2011-05-28  8:57 ` Nishanth Menon
  2011-07-02 21:56 ` Rafael J. Wysocki
  2011-07-02 21:56 ` Rafael J. Wysocki
  7 siblings, 0 replies; 35+ messages in thread
From: Nishanth Menon @ 2011-05-28  8:57 UTC (permalink / raw)
  To: MyungJoo Ham
  Cc: Len Brown, Greg Kroah-Hartman, linux-kernel, Kyungmin Park,
	linux-pm, Thomas Gleixner

On 13:53-20110527, MyungJoo Ham wrote:
[..]
> diff --git a/drivers/base/power/opp.c b/drivers/base/power/opp.c
> index 56a6899..819c1b3 100644
> --- a/drivers/base/power/opp.c
> +++ b/drivers/base/power/opp.c
> @@ -21,6 +21,7 @@
>  #include <linux/rculist.h>
>  #include <linux/rcupdate.h>
>  #include <linux/opp.h>
> +#include <linux/devfreq.h>
>  
>  /*
>   * Internal data structure organization with the OPP layer library is as
> @@ -428,6 +429,11 @@ int opp_add(struct device *dev, unsigned long freq, unsigned long u_volt)
>  	list_add_rcu(&new_opp->node, head);
>  	mutex_unlock(&dev_opp_list_lock);
>  
> +	/*
> +	 * Notify generic dvfs for the change and ignore error
> +	 * because the device may not have a devfreq entry
> +	 */
> +	devfreq_update(dev);
I think I understand your thought about notifiers here - we had some
initial thought that we may need notifiers, for e.g. clk change
notifiers which could implement enable/disable on a dependent device,
thermal management etc.. so far, we have'nt had a need to modify the
lowest data store layer to handle this. In this case, I think you'd
like a notifier mechanism in general for opp_add,enable or disable in
the opp library. However, with this change, an opp_add now means:
a) opp_add now depends on devfreq_update() to be part of it's integral
operation - I think the devfreq should maintain it's own notifier
mechanisms and not make OPP library do the notification trigger job of
devfreq.
b) By ignoring the error, how will the caller of opp_add now know if the
failures were real - e.g. some device failed to act on the notification
and the overall opp_add operation failed? And why should OPP library do
recovery for on behalf of devfreq?
c) How will you ensure the lock accuracy - given that you need to keep
to the constraints of opp_add/opp_set_availability here? For example:
devfreq_do is called from devfreq_monitor(workqueue) and devfreq_update
in addition to having to call a bunch of function pointers whose
implementation you do not know, having a function that ensures security
of it's data, handles all lock conditions that is imposed differently by
add,enable and disable is not visible in this implementation.
c) How about considering the usage of OPP library with an SoC cpufreq
implementation say for MPU AND using the same SoC using devfreq for
controlling some devices like MALI in your example simultaneously? Lets
say the thermal policy manager implementation for some reason did an
opp_disable of a higher OPP of MPU - devfreq_update gets called with the
device(mpu_dev).. devfreq_update needs to now invoke it's validation
path and fails correctly - Vs devfreq_update being called by some module
which actually uses devfreq and the update failed - How does one
differentiate between the two?

just my 2 cents, I think these kind of notifiers should probably
stay out of opp library OR notifiers be introduced in a generic and safe
fashion.

>  	return 0;
>  }
>  
> @@ -512,6 +518,9 @@ unlock:
>  	mutex_unlock(&dev_opp_list_lock);
>  out:
>  	kfree(new_opp);
> +
> +	/* Notify generic dvfs for the change and ignore error */
> +	devfreq_update(dev);
Another place where I am confused - how does devfreq_update know in what
context is it being called is the OPP getting enabled/disabled/added -
devfreq_update has to do some additional work to make it's own decision
- it is almost as if it uses devfreq_update as a trigger mechanism to reevalute
the entire decision tree - without using information of the context
opp_enable/disable/add has to maybe make better call. if this
information is not needed, maybe some other mechanism implemented by
devfreq layer might suffice..

>  	return r;
>  }

[...]
-- 
Regards,
Nishanth Menon

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 1/3] PM: Introduce DEVFREQ: generic DVFS framework with device-specific OPPs
  2011-05-28  8:57 ` [PATCH v3 1/3] PM: Introduce DEVFREQ: generic DVFS framework with device-specific OPPs Nishanth Menon
@ 2011-05-30  5:03   ` MyungJoo Ham
  2011-06-16 21:11     ` Rafael J. Wysocki
  2011-06-16 21:11     ` Rafael J. Wysocki
  2011-05-30  5:03   ` MyungJoo Ham
  1 sibling, 2 replies; 35+ messages in thread
From: MyungJoo Ham @ 2011-05-30  5:03 UTC (permalink / raw)
  To: Nishanth Menon
  Cc: linux-pm, linux-kernel, Len Brown, Pavel Machek,
	Rafael J. Wysocki, Greg Kroah-Hartman, Kyungmin Park,
	Jiejing Zhang, Colin Cross, Thomas Gleixner, MyungJoo Ham

On Sat, May 28, 2011 at 5:57 PM, Nishanth Menon <nm@ti.com> wrote:
> On 13:53-20110527, MyungJoo Ham wrote:
> [..]
>> diff --git a/drivers/base/power/opp.c b/drivers/base/power/opp.c
>> index 56a6899..819c1b3 100644
>> --- a/drivers/base/power/opp.c
>> +++ b/drivers/base/power/opp.c
>> @@ -21,6 +21,7 @@
>>  #include <linux/rculist.h>
>>  #include <linux/rcupdate.h>
>>  #include <linux/opp.h>
>> +#include <linux/devfreq.h>
>>
>>  /*
>>   * Internal data structure organization with the OPP layer library is as
>> @@ -428,6 +429,11 @@ int opp_add(struct device *dev, unsigned long freq, unsigned long u_volt)
>>       list_add_rcu(&new_opp->node, head);
>>       mutex_unlock(&dev_opp_list_lock);
>>
>> +     /*
>> +      * Notify generic dvfs for the change and ignore error
>> +      * because the device may not have a devfreq entry
>> +      */
>> +     devfreq_update(dev);
> I think I understand your thought about notifiers here - we had some
> initial thought that we may need notifiers, for e.g. clk change
> notifiers which could implement enable/disable on a dependent device,
> thermal management etc.. so far, we have'nt had a need to modify the
> lowest data store layer to handle this. In this case, I think you'd
> like a notifier mechanism in general for opp_add,enable or disable in
> the opp library.

Yes, I wanted a notifier machnaism called by OPP for any potential
changes in available frequencies. For that purpose, I've added
devfreq_update() because, for now, DEVFREQ is the only one that needs
such a notifier and a notifier has some overhead. If there are
multiple entities that require notifications from OPP for such
changes, we'd need notifiers maintained by OPP.

> However, with this change, an opp_add now means:
> a) opp_add now depends on devfreq_update() to be part of it's integral
> operation - I think the devfreq should maintain it's own notifier
> mechanisms and not make OPP library do the notification trigger job of
> devfreq.

To let the entities depending on the list of available frequencies be
notified, I think it would be appropriate for OPP to have a notifier
chain and call the notifier chain whenever there is a change in the
list. In that way, we can always add and remove the depending entities
(including DEVFREQ) to and from OPP's. Otherwise, we'd need OPP users
that call OPP functions to call the notifier chain when it appears
that the call will change the list.

As long as the updates in available frequencies affect other entities,
OPP library should be able to notify the others whenever such changes
occur. In this patch, I've used devfreq_update() directly because we
do not have other entities, yet. However, as you've  mentioned, there
will be others, so I'm willing to implement notifier in OPP if other
people also agree.

> b) By ignoring the error, how will the caller of opp_add now know if the
> failures were real - e.g. some device failed to act on the notification
> and the overall opp_add operation failed? And why should OPP library do
> recovery for on behalf of devfreq?

Um... OPP does not need to do any recovery for errors from DEVFREQ.
Other than errors from find_device_devfreq() in devfreq_update(), the
errors seem to be better returned somehow to the caller of opp_add().
However, would it be fine to let opp_add (and other
frequency-availability changing functions) return errors returned from
a notifier? (either devfreq_update() or srcu_notifier_call_chain())

> c) How will you ensure the lock accuracy - given that you need to keep
> to the constraints of opp_add/opp_set_availability here? For example:
> devfreq_do is called from devfreq_monitor(workqueue) and devfreq_update
> in addition to having to call a bunch of function pointers whose
> implementation you do not know, having a function that ensures security
> of it's data, handles all lock conditions that is imposed differently by
> add,enable and disable is not visible in this implementation.

devfreq_update() waits for devfreq_monitor() with devfreq_list_lock.
Both has devfreq_list_lock locked during their operations.

> c) How about considering the usage of OPP library with an SoC cpufreq
> implementation say for MPU AND using the same SoC using devfreq for
> controlling some devices like MALI in your example simultaneously? Lets
> say the thermal policy manager implementation for some reason did an
> opp_disable of a higher OPP of MPU - devfreq_update gets called with the
> device(mpu_dev).. devfreq_update needs to now invoke it's validation
> path and fails correctly - Vs devfreq_update being called by some module
> which actually uses devfreq and the update failed - How does one
> differentiate between the two?

When OPP is being used by CPUFREQ w/ CPU and by MALI, the OPP list
should be attached to the two devices independently.
In other words, the frequency of CPU should be controlled
independently from the frequency of OPP. If both CPU and MALI should
use the same frequency, there is no point to implement devfreq for
MALI. However, if the condition is not fully-dependent; e.g.,
Freq(MALI) <= Freq(CPU), we can implement two instances of OPP for
each of CPU and MALI and let the "target" function supplied to CPU to
run opp_enable/disable and devfreq_update() to MALI before adjusting
the frequency of itself.

When we are implementing another entity that controls OPP/DEVFREQ such
as the thermal policy manager, it should tackle both CPU and MALI w/
OPPs at the same time if they are fully independent (i.e., the
frequency of MALI is not affected by the frequency of CPU). If not,
the same things (discussed in the previous paragraph) are applied.

In the two examples you've mentioned, the first one is about the
implementation of the current patch (let thermal manager handle OPP
and OPP handle devfreq_update) and the another is when DEVFREQ calls
are implemented through thermal manager (let thermal manager handle
both OPP and devfreq_update), right?

For the two cases, I think that the first approach is more
appropriate. We will probably have more than just a thermal manager to
handle OPP (and devfreq that depends on OPP). The first approach
allows users to declare which frequencies are available and to let the
underlying structure do their work without intervention from users
(those who decide which frequencies are available). With the second
approach, we need to enforce every frequency affecting entity to
understand underlying frequency dependencies and the behavior of
devfreq, not jst to understand the behavior of the device according to
the frequency. DEVFREQ is going to be handled and implemented by the
device driver of the device controlled by the instance of DEVFREQ. By
using the first approach, we can prevent affecting entities other than
the device driver using DEVFREQ from thinking about DEVFREQ of that
device.


>
> just my 2 cents, I think these kind of notifiers should probably
> stay out of opp library OR notifiers be introduced in a generic and safe
> fashion.

So, you want either

to call devfreq_update from where opp_add/enable/disable/... are called.

or

to use notifier in OPP?


Well, the first one would become the source of messy problems as
mentioned earlier. The second one has some overhead, but it's a
general approach and does not incur the issues of the first approach.

Therefore, I prefer either to keep the initial approach regarding
devfreq_update() or to use a generic notifier at OPP (probably, SRCU
notifier chains?).

How about using a SRCU_NOTIFIER at OPP?

The notifier.h says that SRCU has low overhead generally and the most
overhead of SRCU comes from "remove srcu" (probably when "device" is
being removed), which won't happen a lot with OPP.

>
>>       return 0;
>>  }
>>
>> @@ -512,6 +518,9 @@ unlock:
>>       mutex_unlock(&dev_opp_list_lock);
>>  out:
>>       kfree(new_opp);
>> +
>> +     /* Notify generic dvfs for the change and ignore error */
>> +     devfreq_update(dev);
> Another place where I am confused - how does devfreq_update know in what
> context is it being called is the OPP getting enabled/disabled/added -
> devfreq_update has to do some additional work to make it's own decision
> - it is almost as if it uses devfreq_update as a trigger mechanism to reevalute
> the entire decision tree - without using information of the context
> opp_enable/disable/add has to maybe make better call. if this
> information is not needed, maybe some other mechanism implemented by
> devfreq layer might suffice..

No, DEVFREQ does not require the context (OPP being
enabled/disabled/...) with the devfreq_update() calls. It is
essentially a "notify devfreq to reevaluate" and DEVFREQ reevaluates
based on the available frequencies.

Thus, a plain srcu_notifier_call_chain() should work for
DEVFREQ/devfreq_update from OPP.

>
>>       return r;
>>  }
>
> [...]
> --
> Regards,
> Nishanth Menon
>


Thank you for your comments, Nishanth.


Cheers!
- MyungJoo

-- 
MyungJoo Ham (함명주), Ph.D.
Mobile Software Platform Lab,
Digital Media and Communications (DMC) Business
Samsung Electronics
cell: 82-10-6714-2858

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 1/3] PM: Introduce DEVFREQ: generic DVFS framework with device-specific OPPs
  2011-05-28  8:57 ` [PATCH v3 1/3] PM: Introduce DEVFREQ: generic DVFS framework with device-specific OPPs Nishanth Menon
  2011-05-30  5:03   ` MyungJoo Ham
@ 2011-05-30  5:03   ` MyungJoo Ham
  1 sibling, 0 replies; 35+ messages in thread
From: MyungJoo Ham @ 2011-05-30  5:03 UTC (permalink / raw)
  To: Nishanth Menon
  Cc: Len Brown, Greg Kroah-Hartman, linux-kernel, Kyungmin Park,
	linux-pm, Thomas Gleixner

On Sat, May 28, 2011 at 5:57 PM, Nishanth Menon <nm@ti.com> wrote:
> On 13:53-20110527, MyungJoo Ham wrote:
> [..]
>> diff --git a/drivers/base/power/opp.c b/drivers/base/power/opp.c
>> index 56a6899..819c1b3 100644
>> --- a/drivers/base/power/opp.c
>> +++ b/drivers/base/power/opp.c
>> @@ -21,6 +21,7 @@
>>  #include <linux/rculist.h>
>>  #include <linux/rcupdate.h>
>>  #include <linux/opp.h>
>> +#include <linux/devfreq.h>
>>
>>  /*
>>   * Internal data structure organization with the OPP layer library is as
>> @@ -428,6 +429,11 @@ int opp_add(struct device *dev, unsigned long freq, unsigned long u_volt)
>>       list_add_rcu(&new_opp->node, head);
>>       mutex_unlock(&dev_opp_list_lock);
>>
>> +     /*
>> +      * Notify generic dvfs for the change and ignore error
>> +      * because the device may not have a devfreq entry
>> +      */
>> +     devfreq_update(dev);
> I think I understand your thought about notifiers here - we had some
> initial thought that we may need notifiers, for e.g. clk change
> notifiers which could implement enable/disable on a dependent device,
> thermal management etc.. so far, we have'nt had a need to modify the
> lowest data store layer to handle this. In this case, I think you'd
> like a notifier mechanism in general for opp_add,enable or disable in
> the opp library.

Yes, I wanted a notifier machnaism called by OPP for any potential
changes in available frequencies. For that purpose, I've added
devfreq_update() because, for now, DEVFREQ is the only one that needs
such a notifier and a notifier has some overhead. If there are
multiple entities that require notifications from OPP for such
changes, we'd need notifiers maintained by OPP.

> However, with this change, an opp_add now means:
> a) opp_add now depends on devfreq_update() to be part of it's integral
> operation - I think the devfreq should maintain it's own notifier
> mechanisms and not make OPP library do the notification trigger job of
> devfreq.

To let the entities depending on the list of available frequencies be
notified, I think it would be appropriate for OPP to have a notifier
chain and call the notifier chain whenever there is a change in the
list. In that way, we can always add and remove the depending entities
(including DEVFREQ) to and from OPP's. Otherwise, we'd need OPP users
that call OPP functions to call the notifier chain when it appears
that the call will change the list.

As long as the updates in available frequencies affect other entities,
OPP library should be able to notify the others whenever such changes
occur. In this patch, I've used devfreq_update() directly because we
do not have other entities, yet. However, as you've  mentioned, there
will be others, so I'm willing to implement notifier in OPP if other
people also agree.

> b) By ignoring the error, how will the caller of opp_add now know if the
> failures were real - e.g. some device failed to act on the notification
> and the overall opp_add operation failed? And why should OPP library do
> recovery for on behalf of devfreq?

Um... OPP does not need to do any recovery for errors from DEVFREQ.
Other than errors from find_device_devfreq() in devfreq_update(), the
errors seem to be better returned somehow to the caller of opp_add().
However, would it be fine to let opp_add (and other
frequency-availability changing functions) return errors returned from
a notifier? (either devfreq_update() or srcu_notifier_call_chain())

> c) How will you ensure the lock accuracy - given that you need to keep
> to the constraints of opp_add/opp_set_availability here? For example:
> devfreq_do is called from devfreq_monitor(workqueue) and devfreq_update
> in addition to having to call a bunch of function pointers whose
> implementation you do not know, having a function that ensures security
> of it's data, handles all lock conditions that is imposed differently by
> add,enable and disable is not visible in this implementation.

devfreq_update() waits for devfreq_monitor() with devfreq_list_lock.
Both has devfreq_list_lock locked during their operations.

> c) How about considering the usage of OPP library with an SoC cpufreq
> implementation say for MPU AND using the same SoC using devfreq for
> controlling some devices like MALI in your example simultaneously? Lets
> say the thermal policy manager implementation for some reason did an
> opp_disable of a higher OPP of MPU - devfreq_update gets called with the
> device(mpu_dev).. devfreq_update needs to now invoke it's validation
> path and fails correctly - Vs devfreq_update being called by some module
> which actually uses devfreq and the update failed - How does one
> differentiate between the two?

When OPP is being used by CPUFREQ w/ CPU and by MALI, the OPP list
should be attached to the two devices independently.
In other words, the frequency of CPU should be controlled
independently from the frequency of OPP. If both CPU and MALI should
use the same frequency, there is no point to implement devfreq for
MALI. However, if the condition is not fully-dependent; e.g.,
Freq(MALI) <= Freq(CPU), we can implement two instances of OPP for
each of CPU and MALI and let the "target" function supplied to CPU to
run opp_enable/disable and devfreq_update() to MALI before adjusting
the frequency of itself.

When we are implementing another entity that controls OPP/DEVFREQ such
as the thermal policy manager, it should tackle both CPU and MALI w/
OPPs at the same time if they are fully independent (i.e., the
frequency of MALI is not affected by the frequency of CPU). If not,
the same things (discussed in the previous paragraph) are applied.

In the two examples you've mentioned, the first one is about the
implementation of the current patch (let thermal manager handle OPP
and OPP handle devfreq_update) and the another is when DEVFREQ calls
are implemented through thermal manager (let thermal manager handle
both OPP and devfreq_update), right?

For the two cases, I think that the first approach is more
appropriate. We will probably have more than just a thermal manager to
handle OPP (and devfreq that depends on OPP). The first approach
allows users to declare which frequencies are available and to let the
underlying structure do their work without intervention from users
(those who decide which frequencies are available). With the second
approach, we need to enforce every frequency affecting entity to
understand underlying frequency dependencies and the behavior of
devfreq, not jst to understand the behavior of the device according to
the frequency. DEVFREQ is going to be handled and implemented by the
device driver of the device controlled by the instance of DEVFREQ. By
using the first approach, we can prevent affecting entities other than
the device driver using DEVFREQ from thinking about DEVFREQ of that
device.


>
> just my 2 cents, I think these kind of notifiers should probably
> stay out of opp library OR notifiers be introduced in a generic and safe
> fashion.

So, you want either

to call devfreq_update from where opp_add/enable/disable/... are called.

or

to use notifier in OPP?


Well, the first one would become the source of messy problems as
mentioned earlier. The second one has some overhead, but it's a
general approach and does not incur the issues of the first approach.

Therefore, I prefer either to keep the initial approach regarding
devfreq_update() or to use a generic notifier at OPP (probably, SRCU
notifier chains?).

How about using a SRCU_NOTIFIER at OPP?

The notifier.h says that SRCU has low overhead generally and the most
overhead of SRCU comes from "remove srcu" (probably when "device" is
being removed), which won't happen a lot with OPP.

>
>>       return 0;
>>  }
>>
>> @@ -512,6 +518,9 @@ unlock:
>>       mutex_unlock(&dev_opp_list_lock);
>>  out:
>>       kfree(new_opp);
>> +
>> +     /* Notify generic dvfs for the change and ignore error */
>> +     devfreq_update(dev);
> Another place where I am confused - how does devfreq_update know in what
> context is it being called is the OPP getting enabled/disabled/added -
> devfreq_update has to do some additional work to make it's own decision
> - it is almost as if it uses devfreq_update as a trigger mechanism to reevalute
> the entire decision tree - without using information of the context
> opp_enable/disable/add has to maybe make better call. if this
> information is not needed, maybe some other mechanism implemented by
> devfreq layer might suffice..

No, DEVFREQ does not require the context (OPP being
enabled/disabled/...) with the devfreq_update() calls. It is
essentially a "notify devfreq to reevaluate" and DEVFREQ reevaluates
based on the available frequencies.

Thus, a plain srcu_notifier_call_chain() should work for
DEVFREQ/devfreq_update from OPP.

>
>>       return r;
>>  }
>
> [...]
> --
> Regards,
> Nishanth Menon
>


Thank you for your comments, Nishanth.


Cheers!
- MyungJoo

-- 
MyungJoo Ham (함명주), Ph.D.
Mobile Software Platform Lab,
Digital Media and Communications (DMC) Business
Samsung Electronics
cell: 82-10-6714-2858
_______________________________________________
linux-pm mailing list
linux-pm@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/linux-pm

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 1/3] PM: Introduce DEVFREQ: generic DVFS framework with device-specific OPPs
  2011-05-30  5:03   ` MyungJoo Ham
@ 2011-06-16 21:11     ` Rafael J. Wysocki
  2011-07-02 21:30       ` Rafael J. Wysocki
  2011-07-02 21:30       ` Rafael J. Wysocki
  2011-06-16 21:11     ` Rafael J. Wysocki
  1 sibling, 2 replies; 35+ messages in thread
From: Rafael J. Wysocki @ 2011-06-16 21:11 UTC (permalink / raw)
  To: myungjoo.ham, Nishanth Menon
  Cc: linux-pm, linux-kernel, Len Brown, Pavel Machek,
	Greg Kroah-Hartman, Kyungmin Park, Jiejing Zhang, Colin Cross,
	Thomas Gleixner

Nishanth,

Have your comments been addressed by the message below?

Rafael


On Monday, May 30, 2011, MyungJoo Ham wrote:
> On Sat, May 28, 2011 at 5:57 PM, Nishanth Menon <nm@ti.com> wrote:
> > On 13:53-20110527, MyungJoo Ham wrote:
> > [..]
> >> diff --git a/drivers/base/power/opp.c b/drivers/base/power/opp.c
> >> index 56a6899..819c1b3 100644
> >> --- a/drivers/base/power/opp.c
> >> +++ b/drivers/base/power/opp.c
> >> @@ -21,6 +21,7 @@
> >>  #include <linux/rculist.h>
> >>  #include <linux/rcupdate.h>
> >>  #include <linux/opp.h>
> >> +#include <linux/devfreq.h>
> >>
> >>  /*
> >>   * Internal data structure organization with the OPP layer library is as
> >> @@ -428,6 +429,11 @@ int opp_add(struct device *dev, unsigned long freq, unsigned long u_volt)
> >>       list_add_rcu(&new_opp->node, head);
> >>       mutex_unlock(&dev_opp_list_lock);
> >>
> >> +     /*
> >> +      * Notify generic dvfs for the change and ignore error
> >> +      * because the device may not have a devfreq entry
> >> +      */
> >> +     devfreq_update(dev);
> > I think I understand your thought about notifiers here - we had some
> > initial thought that we may need notifiers, for e.g. clk change
> > notifiers which could implement enable/disable on a dependent device,
> > thermal management etc.. so far, we have'nt had a need to modify the
> > lowest data store layer to handle this. In this case, I think you'd
> > like a notifier mechanism in general for opp_add,enable or disable in
> > the opp library.
> 
> Yes, I wanted a notifier machnaism called by OPP for any potential
> changes in available frequencies. For that purpose, I've added
> devfreq_update() because, for now, DEVFREQ is the only one that needs
> such a notifier and a notifier has some overhead. If there are
> multiple entities that require notifications from OPP for such
> changes, we'd need notifiers maintained by OPP.
> 
> > However, with this change, an opp_add now means:
> > a) opp_add now depends on devfreq_update() to be part of it's integral
> > operation - I think the devfreq should maintain it's own notifier
> > mechanisms and not make OPP library do the notification trigger job of
> > devfreq.
> 
> To let the entities depending on the list of available frequencies be
> notified, I think it would be appropriate for OPP to have a notifier
> chain and call the notifier chain whenever there is a change in the
> list. In that way, we can always add and remove the depending entities
> (including DEVFREQ) to and from OPP's. Otherwise, we'd need OPP users
> that call OPP functions to call the notifier chain when it appears
> that the call will change the list.
> 
> As long as the updates in available frequencies affect other entities,
> OPP library should be able to notify the others whenever such changes
> occur. In this patch, I've used devfreq_update() directly because we
> do not have other entities, yet. However, as you've  mentioned, there
> will be others, so I'm willing to implement notifier in OPP if other
> people also agree.
> 
> > b) By ignoring the error, how will the caller of opp_add now know if the
> > failures were real - e.g. some device failed to act on the notification
> > and the overall opp_add operation failed? And why should OPP library do
> > recovery for on behalf of devfreq?
> 
> Um... OPP does not need to do any recovery for errors from DEVFREQ.
> Other than errors from find_device_devfreq() in devfreq_update(), the
> errors seem to be better returned somehow to the caller of opp_add().
> However, would it be fine to let opp_add (and other
> frequency-availability changing functions) return errors returned from
> a notifier? (either devfreq_update() or srcu_notifier_call_chain())
> 
> > c) How will you ensure the lock accuracy - given that you need to keep
> > to the constraints of opp_add/opp_set_availability here? For example:
> > devfreq_do is called from devfreq_monitor(workqueue) and devfreq_update
> > in addition to having to call a bunch of function pointers whose
> > implementation you do not know, having a function that ensures security
> > of it's data, handles all lock conditions that is imposed differently by
> > add,enable and disable is not visible in this implementation.
> 
> devfreq_update() waits for devfreq_monitor() with devfreq_list_lock.
> Both has devfreq_list_lock locked during their operations.
> 
> > c) How about considering the usage of OPP library with an SoC cpufreq
> > implementation say for MPU AND using the same SoC using devfreq for
> > controlling some devices like MALI in your example simultaneously? Lets
> > say the thermal policy manager implementation for some reason did an
> > opp_disable of a higher OPP of MPU - devfreq_update gets called with the
> > device(mpu_dev).. devfreq_update needs to now invoke it's validation
> > path and fails correctly - Vs devfreq_update being called by some module
> > which actually uses devfreq and the update failed - How does one
> > differentiate between the two?
> 
> When OPP is being used by CPUFREQ w/ CPU and by MALI, the OPP list
> should be attached to the two devices independently.
> In other words, the frequency of CPU should be controlled
> independently from the frequency of OPP. If both CPU and MALI should
> use the same frequency, there is no point to implement devfreq for
> MALI. However, if the condition is not fully-dependent; e.g.,
> Freq(MALI) <= Freq(CPU), we can implement two instances of OPP for
> each of CPU and MALI and let the "target" function supplied to CPU to
> run opp_enable/disable and devfreq_update() to MALI before adjusting
> the frequency of itself.
> 
> When we are implementing another entity that controls OPP/DEVFREQ such
> as the thermal policy manager, it should tackle both CPU and MALI w/
> OPPs at the same time if they are fully independent (i.e., the
> frequency of MALI is not affected by the frequency of CPU). If not,
> the same things (discussed in the previous paragraph) are applied.
> 
> In the two examples you've mentioned, the first one is about the
> implementation of the current patch (let thermal manager handle OPP
> and OPP handle devfreq_update) and the another is when DEVFREQ calls
> are implemented through thermal manager (let thermal manager handle
> both OPP and devfreq_update), right?
> 
> For the two cases, I think that the first approach is more
> appropriate. We will probably have more than just a thermal manager to
> handle OPP (and devfreq that depends on OPP). The first approach
> allows users to declare which frequencies are available and to let the
> underlying structure do their work without intervention from users
> (those who decide which frequencies are available). With the second
> approach, we need to enforce every frequency affecting entity to
> understand underlying frequency dependencies and the behavior of
> devfreq, not jst to understand the behavior of the device according to
> the frequency. DEVFREQ is going to be handled and implemented by the
> device driver of the device controlled by the instance of DEVFREQ. By
> using the first approach, we can prevent affecting entities other than
> the device driver using DEVFREQ from thinking about DEVFREQ of that
> device.
> 
> 
> >
> > just my 2 cents, I think these kind of notifiers should probably
> > stay out of opp library OR notifiers be introduced in a generic and safe
> > fashion.
> 
> So, you want either
> 
> to call devfreq_update from where opp_add/enable/disable/... are called.
> 
> or
> 
> to use notifier in OPP?
> 
> 
> Well, the first one would become the source of messy problems as
> mentioned earlier. The second one has some overhead, but it's a
> general approach and does not incur the issues of the first approach.
> 
> Therefore, I prefer either to keep the initial approach regarding
> devfreq_update() or to use a generic notifier at OPP (probably, SRCU
> notifier chains?).
> 
> How about using a SRCU_NOTIFIER at OPP?
> 
> The notifier.h says that SRCU has low overhead generally and the most
> overhead of SRCU comes from "remove srcu" (probably when "device" is
> being removed), which won't happen a lot with OPP.
> 
> >
> >>       return 0;
> >>  }
> >>
> >> @@ -512,6 +518,9 @@ unlock:
> >>       mutex_unlock(&dev_opp_list_lock);
> >>  out:
> >>       kfree(new_opp);
> >> +
> >> +     /* Notify generic dvfs for the change and ignore error */
> >> +     devfreq_update(dev);
> > Another place where I am confused - how does devfreq_update know in what
> > context is it being called is the OPP getting enabled/disabled/added -
> > devfreq_update has to do some additional work to make it's own decision
> > - it is almost as if it uses devfreq_update as a trigger mechanism to reevalute
> > the entire decision tree - without using information of the context
> > opp_enable/disable/add has to maybe make better call. if this
> > information is not needed, maybe some other mechanism implemented by
> > devfreq layer might suffice..
> 
> No, DEVFREQ does not require the context (OPP being
> enabled/disabled/...) with the devfreq_update() calls. It is
> essentially a "notify devfreq to reevaluate" and DEVFREQ reevaluates
> based on the available frequencies.
> 
> Thus, a plain srcu_notifier_call_chain() should work for
> DEVFREQ/devfreq_update from OPP.
> 
> >
> >>       return r;
> >>  }
> >
> > [...]
> > --
> > Regards,
> > Nishanth Menon
> >
> 
> 
> Thank you for your comments, Nishanth.
> 
> 
> Cheers!
> - MyungJoo
> 
> 


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 1/3] PM: Introduce DEVFREQ: generic DVFS framework with device-specific OPPs
  2011-05-30  5:03   ` MyungJoo Ham
  2011-06-16 21:11     ` Rafael J. Wysocki
@ 2011-06-16 21:11     ` Rafael J. Wysocki
  1 sibling, 0 replies; 35+ messages in thread
From: Rafael J. Wysocki @ 2011-06-16 21:11 UTC (permalink / raw)
  To: myungjoo.ham, Nishanth Menon
  Cc: Len Brown, Greg Kroah-Hartman, linux-kernel, Kyungmin Park,
	linux-pm, Thomas Gleixner

Nishanth,

Have your comments been addressed by the message below?

Rafael


On Monday, May 30, 2011, MyungJoo Ham wrote:
> On Sat, May 28, 2011 at 5:57 PM, Nishanth Menon <nm@ti.com> wrote:
> > On 13:53-20110527, MyungJoo Ham wrote:
> > [..]
> >> diff --git a/drivers/base/power/opp.c b/drivers/base/power/opp.c
> >> index 56a6899..819c1b3 100644
> >> --- a/drivers/base/power/opp.c
> >> +++ b/drivers/base/power/opp.c
> >> @@ -21,6 +21,7 @@
> >>  #include <linux/rculist.h>
> >>  #include <linux/rcupdate.h>
> >>  #include <linux/opp.h>
> >> +#include <linux/devfreq.h>
> >>
> >>  /*
> >>   * Internal data structure organization with the OPP layer library is as
> >> @@ -428,6 +429,11 @@ int opp_add(struct device *dev, unsigned long freq, unsigned long u_volt)
> >>       list_add_rcu(&new_opp->node, head);
> >>       mutex_unlock(&dev_opp_list_lock);
> >>
> >> +     /*
> >> +      * Notify generic dvfs for the change and ignore error
> >> +      * because the device may not have a devfreq entry
> >> +      */
> >> +     devfreq_update(dev);
> > I think I understand your thought about notifiers here - we had some
> > initial thought that we may need notifiers, for e.g. clk change
> > notifiers which could implement enable/disable on a dependent device,
> > thermal management etc.. so far, we have'nt had a need to modify the
> > lowest data store layer to handle this. In this case, I think you'd
> > like a notifier mechanism in general for opp_add,enable or disable in
> > the opp library.
> 
> Yes, I wanted a notifier machnaism called by OPP for any potential
> changes in available frequencies. For that purpose, I've added
> devfreq_update() because, for now, DEVFREQ is the only one that needs
> such a notifier and a notifier has some overhead. If there are
> multiple entities that require notifications from OPP for such
> changes, we'd need notifiers maintained by OPP.
> 
> > However, with this change, an opp_add now means:
> > a) opp_add now depends on devfreq_update() to be part of it's integral
> > operation - I think the devfreq should maintain it's own notifier
> > mechanisms and not make OPP library do the notification trigger job of
> > devfreq.
> 
> To let the entities depending on the list of available frequencies be
> notified, I think it would be appropriate for OPP to have a notifier
> chain and call the notifier chain whenever there is a change in the
> list. In that way, we can always add and remove the depending entities
> (including DEVFREQ) to and from OPP's. Otherwise, we'd need OPP users
> that call OPP functions to call the notifier chain when it appears
> that the call will change the list.
> 
> As long as the updates in available frequencies affect other entities,
> OPP library should be able to notify the others whenever such changes
> occur. In this patch, I've used devfreq_update() directly because we
> do not have other entities, yet. However, as you've  mentioned, there
> will be others, so I'm willing to implement notifier in OPP if other
> people also agree.
> 
> > b) By ignoring the error, how will the caller of opp_add now know if the
> > failures were real - e.g. some device failed to act on the notification
> > and the overall opp_add operation failed? And why should OPP library do
> > recovery for on behalf of devfreq?
> 
> Um... OPP does not need to do any recovery for errors from DEVFREQ.
> Other than errors from find_device_devfreq() in devfreq_update(), the
> errors seem to be better returned somehow to the caller of opp_add().
> However, would it be fine to let opp_add (and other
> frequency-availability changing functions) return errors returned from
> a notifier? (either devfreq_update() or srcu_notifier_call_chain())
> 
> > c) How will you ensure the lock accuracy - given that you need to keep
> > to the constraints of opp_add/opp_set_availability here? For example:
> > devfreq_do is called from devfreq_monitor(workqueue) and devfreq_update
> > in addition to having to call a bunch of function pointers whose
> > implementation you do not know, having a function that ensures security
> > of it's data, handles all lock conditions that is imposed differently by
> > add,enable and disable is not visible in this implementation.
> 
> devfreq_update() waits for devfreq_monitor() with devfreq_list_lock.
> Both has devfreq_list_lock locked during their operations.
> 
> > c) How about considering the usage of OPP library with an SoC cpufreq
> > implementation say for MPU AND using the same SoC using devfreq for
> > controlling some devices like MALI in your example simultaneously? Lets
> > say the thermal policy manager implementation for some reason did an
> > opp_disable of a higher OPP of MPU - devfreq_update gets called with the
> > device(mpu_dev).. devfreq_update needs to now invoke it's validation
> > path and fails correctly - Vs devfreq_update being called by some module
> > which actually uses devfreq and the update failed - How does one
> > differentiate between the two?
> 
> When OPP is being used by CPUFREQ w/ CPU and by MALI, the OPP list
> should be attached to the two devices independently.
> In other words, the frequency of CPU should be controlled
> independently from the frequency of OPP. If both CPU and MALI should
> use the same frequency, there is no point to implement devfreq for
> MALI. However, if the condition is not fully-dependent; e.g.,
> Freq(MALI) <= Freq(CPU), we can implement two instances of OPP for
> each of CPU and MALI and let the "target" function supplied to CPU to
> run opp_enable/disable and devfreq_update() to MALI before adjusting
> the frequency of itself.
> 
> When we are implementing another entity that controls OPP/DEVFREQ such
> as the thermal policy manager, it should tackle both CPU and MALI w/
> OPPs at the same time if they are fully independent (i.e., the
> frequency of MALI is not affected by the frequency of CPU). If not,
> the same things (discussed in the previous paragraph) are applied.
> 
> In the two examples you've mentioned, the first one is about the
> implementation of the current patch (let thermal manager handle OPP
> and OPP handle devfreq_update) and the another is when DEVFREQ calls
> are implemented through thermal manager (let thermal manager handle
> both OPP and devfreq_update), right?
> 
> For the two cases, I think that the first approach is more
> appropriate. We will probably have more than just a thermal manager to
> handle OPP (and devfreq that depends on OPP). The first approach
> allows users to declare which frequencies are available and to let the
> underlying structure do their work without intervention from users
> (those who decide which frequencies are available). With the second
> approach, we need to enforce every frequency affecting entity to
> understand underlying frequency dependencies and the behavior of
> devfreq, not jst to understand the behavior of the device according to
> the frequency. DEVFREQ is going to be handled and implemented by the
> device driver of the device controlled by the instance of DEVFREQ. By
> using the first approach, we can prevent affecting entities other than
> the device driver using DEVFREQ from thinking about DEVFREQ of that
> device.
> 
> 
> >
> > just my 2 cents, I think these kind of notifiers should probably
> > stay out of opp library OR notifiers be introduced in a generic and safe
> > fashion.
> 
> So, you want either
> 
> to call devfreq_update from where opp_add/enable/disable/... are called.
> 
> or
> 
> to use notifier in OPP?
> 
> 
> Well, the first one would become the source of messy problems as
> mentioned earlier. The second one has some overhead, but it's a
> general approach and does not incur the issues of the first approach.
> 
> Therefore, I prefer either to keep the initial approach regarding
> devfreq_update() or to use a generic notifier at OPP (probably, SRCU
> notifier chains?).
> 
> How about using a SRCU_NOTIFIER at OPP?
> 
> The notifier.h says that SRCU has low overhead generally and the most
> overhead of SRCU comes from "remove srcu" (probably when "device" is
> being removed), which won't happen a lot with OPP.
> 
> >
> >>       return 0;
> >>  }
> >>
> >> @@ -512,6 +518,9 @@ unlock:
> >>       mutex_unlock(&dev_opp_list_lock);
> >>  out:
> >>       kfree(new_opp);
> >> +
> >> +     /* Notify generic dvfs for the change and ignore error */
> >> +     devfreq_update(dev);
> > Another place where I am confused - how does devfreq_update know in what
> > context is it being called is the OPP getting enabled/disabled/added -
> > devfreq_update has to do some additional work to make it's own decision
> > - it is almost as if it uses devfreq_update as a trigger mechanism to reevalute
> > the entire decision tree - without using information of the context
> > opp_enable/disable/add has to maybe make better call. if this
> > information is not needed, maybe some other mechanism implemented by
> > devfreq layer might suffice..
> 
> No, DEVFREQ does not require the context (OPP being
> enabled/disabled/...) with the devfreq_update() calls. It is
> essentially a "notify devfreq to reevaluate" and DEVFREQ reevaluates
> based on the available frequencies.
> 
> Thus, a plain srcu_notifier_call_chain() should work for
> DEVFREQ/devfreq_update from OPP.
> 
> >
> >>       return r;
> >>  }
> >
> > [...]
> > --
> > Regards,
> > Nishanth Menon
> >
> 
> 
> Thank you for your comments, Nishanth.
> 
> 
> Cheers!
> - MyungJoo
> 
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 1/3] PM: Introduce DEVFREQ: generic DVFS framework with device-specific OPPs
  2011-06-16 21:11     ` Rafael J. Wysocki
  2011-07-02 21:30       ` Rafael J. Wysocki
@ 2011-07-02 21:30       ` Rafael J. Wysocki
  1 sibling, 0 replies; 35+ messages in thread
From: Rafael J. Wysocki @ 2011-07-02 21:30 UTC (permalink / raw)
  To: myungjoo.ham
  Cc: Nishanth Menon, linux-pm, linux-kernel, Len Brown, Pavel Machek,
	Greg Kroah-Hartman, Kyungmin Park, Jiejing Zhang, Colin Cross,
	Thomas Gleixner

On Thursday, June 16, 2011, Rafael J. Wysocki wrote:
> Nishanth,
> 
> Have your comments been addressed by the message below?

>From the silence I gather they have.

Thanks,
Rafael


 
> On Monday, May 30, 2011, MyungJoo Ham wrote:
> > On Sat, May 28, 2011 at 5:57 PM, Nishanth Menon <nm@ti.com> wrote:
> > > On 13:53-20110527, MyungJoo Ham wrote:
> > > [..]
> > >> diff --git a/drivers/base/power/opp.c b/drivers/base/power/opp.c
> > >> index 56a6899..819c1b3 100644
> > >> --- a/drivers/base/power/opp.c
> > >> +++ b/drivers/base/power/opp.c
> > >> @@ -21,6 +21,7 @@
> > >>  #include <linux/rculist.h>
> > >>  #include <linux/rcupdate.h>
> > >>  #include <linux/opp.h>
> > >> +#include <linux/devfreq.h>
> > >>
> > >>  /*
> > >>   * Internal data structure organization with the OPP layer library is as
> > >> @@ -428,6 +429,11 @@ int opp_add(struct device *dev, unsigned long freq, unsigned long u_volt)
> > >>       list_add_rcu(&new_opp->node, head);
> > >>       mutex_unlock(&dev_opp_list_lock);
> > >>
> > >> +     /*
> > >> +      * Notify generic dvfs for the change and ignore error
> > >> +      * because the device may not have a devfreq entry
> > >> +      */
> > >> +     devfreq_update(dev);
> > > I think I understand your thought about notifiers here - we had some
> > > initial thought that we may need notifiers, for e.g. clk change
> > > notifiers which could implement enable/disable on a dependent device,
> > > thermal management etc.. so far, we have'nt had a need to modify the
> > > lowest data store layer to handle this. In this case, I think you'd
> > > like a notifier mechanism in general for opp_add,enable or disable in
> > > the opp library.
> > 
> > Yes, I wanted a notifier machnaism called by OPP for any potential
> > changes in available frequencies. For that purpose, I've added
> > devfreq_update() because, for now, DEVFREQ is the only one that needs
> > such a notifier and a notifier has some overhead. If there are
> > multiple entities that require notifications from OPP for such
> > changes, we'd need notifiers maintained by OPP.
> > 
> > > However, with this change, an opp_add now means:
> > > a) opp_add now depends on devfreq_update() to be part of it's integral
> > > operation - I think the devfreq should maintain it's own notifier
> > > mechanisms and not make OPP library do the notification trigger job of
> > > devfreq.
> > 
> > To let the entities depending on the list of available frequencies be
> > notified, I think it would be appropriate for OPP to have a notifier
> > chain and call the notifier chain whenever there is a change in the
> > list. In that way, we can always add and remove the depending entities
> > (including DEVFREQ) to and from OPP's. Otherwise, we'd need OPP users
> > that call OPP functions to call the notifier chain when it appears
> > that the call will change the list.
> > 
> > As long as the updates in available frequencies affect other entities,
> > OPP library should be able to notify the others whenever such changes
> > occur. In this patch, I've used devfreq_update() directly because we
> > do not have other entities, yet. However, as you've  mentioned, there
> > will be others, so I'm willing to implement notifier in OPP if other
> > people also agree.
> > 
> > > b) By ignoring the error, how will the caller of opp_add now know if the
> > > failures were real - e.g. some device failed to act on the notification
> > > and the overall opp_add operation failed? And why should OPP library do
> > > recovery for on behalf of devfreq?
> > 
> > Um... OPP does not need to do any recovery for errors from DEVFREQ.
> > Other than errors from find_device_devfreq() in devfreq_update(), the
> > errors seem to be better returned somehow to the caller of opp_add().
> > However, would it be fine to let opp_add (and other
> > frequency-availability changing functions) return errors returned from
> > a notifier? (either devfreq_update() or srcu_notifier_call_chain())
> > 
> > > c) How will you ensure the lock accuracy - given that you need to keep
> > > to the constraints of opp_add/opp_set_availability here? For example:
> > > devfreq_do is called from devfreq_monitor(workqueue) and devfreq_update
> > > in addition to having to call a bunch of function pointers whose
> > > implementation you do not know, having a function that ensures security
> > > of it's data, handles all lock conditions that is imposed differently by
> > > add,enable and disable is not visible in this implementation.
> > 
> > devfreq_update() waits for devfreq_monitor() with devfreq_list_lock.
> > Both has devfreq_list_lock locked during their operations.
> > 
> > > c) How about considering the usage of OPP library with an SoC cpufreq
> > > implementation say for MPU AND using the same SoC using devfreq for
> > > controlling some devices like MALI in your example simultaneously? Lets
> > > say the thermal policy manager implementation for some reason did an
> > > opp_disable of a higher OPP of MPU - devfreq_update gets called with the
> > > device(mpu_dev).. devfreq_update needs to now invoke it's validation
> > > path and fails correctly - Vs devfreq_update being called by some module
> > > which actually uses devfreq and the update failed - How does one
> > > differentiate between the two?
> > 
> > When OPP is being used by CPUFREQ w/ CPU and by MALI, the OPP list
> > should be attached to the two devices independently.
> > In other words, the frequency of CPU should be controlled
> > independently from the frequency of OPP. If both CPU and MALI should
> > use the same frequency, there is no point to implement devfreq for
> > MALI. However, if the condition is not fully-dependent; e.g.,
> > Freq(MALI) <= Freq(CPU), we can implement two instances of OPP for
> > each of CPU and MALI and let the "target" function supplied to CPU to
> > run opp_enable/disable and devfreq_update() to MALI before adjusting
> > the frequency of itself.
> > 
> > When we are implementing another entity that controls OPP/DEVFREQ such
> > as the thermal policy manager, it should tackle both CPU and MALI w/
> > OPPs at the same time if they are fully independent (i.e., the
> > frequency of MALI is not affected by the frequency of CPU). If not,
> > the same things (discussed in the previous paragraph) are applied.
> > 
> > In the two examples you've mentioned, the first one is about the
> > implementation of the current patch (let thermal manager handle OPP
> > and OPP handle devfreq_update) and the another is when DEVFREQ calls
> > are implemented through thermal manager (let thermal manager handle
> > both OPP and devfreq_update), right?
> > 
> > For the two cases, I think that the first approach is more
> > appropriate. We will probably have more than just a thermal manager to
> > handle OPP (and devfreq that depends on OPP). The first approach
> > allows users to declare which frequencies are available and to let the
> > underlying structure do their work without intervention from users
> > (those who decide which frequencies are available). With the second
> > approach, we need to enforce every frequency affecting entity to
> > understand underlying frequency dependencies and the behavior of
> > devfreq, not jst to understand the behavior of the device according to
> > the frequency. DEVFREQ is going to be handled and implemented by the
> > device driver of the device controlled by the instance of DEVFREQ. By
> > using the first approach, we can prevent affecting entities other than
> > the device driver using DEVFREQ from thinking about DEVFREQ of that
> > device.
> > 
> > 
> > >
> > > just my 2 cents, I think these kind of notifiers should probably
> > > stay out of opp library OR notifiers be introduced in a generic and safe
> > > fashion.
> > 
> > So, you want either
> > 
> > to call devfreq_update from where opp_add/enable/disable/... are called.
> > 
> > or
> > 
> > to use notifier in OPP?
> > 
> > 
> > Well, the first one would become the source of messy problems as
> > mentioned earlier. The second one has some overhead, but it's a
> > general approach and does not incur the issues of the first approach.
> > 
> > Therefore, I prefer either to keep the initial approach regarding
> > devfreq_update() or to use a generic notifier at OPP (probably, SRCU
> > notifier chains?).
> > 
> > How about using a SRCU_NOTIFIER at OPP?
> > 
> > The notifier.h says that SRCU has low overhead generally and the most
> > overhead of SRCU comes from "remove srcu" (probably when "device" is
> > being removed), which won't happen a lot with OPP.
> > 
> > >
> > >>       return 0;
> > >>  }
> > >>
> > >> @@ -512,6 +518,9 @@ unlock:
> > >>       mutex_unlock(&dev_opp_list_lock);
> > >>  out:
> > >>       kfree(new_opp);
> > >> +
> > >> +     /* Notify generic dvfs for the change and ignore error */
> > >> +     devfreq_update(dev);
> > > Another place where I am confused - how does devfreq_update know in what
> > > context is it being called is the OPP getting enabled/disabled/added -
> > > devfreq_update has to do some additional work to make it's own decision
> > > - it is almost as if it uses devfreq_update as a trigger mechanism to reevalute
> > > the entire decision tree - without using information of the context
> > > opp_enable/disable/add has to maybe make better call. if this
> > > information is not needed, maybe some other mechanism implemented by
> > > devfreq layer might suffice..
> > 
> > No, DEVFREQ does not require the context (OPP being
> > enabled/disabled/...) with the devfreq_update() calls. It is
> > essentially a "notify devfreq to reevaluate" and DEVFREQ reevaluates
> > based on the available frequencies.
> > 
> > Thus, a plain srcu_notifier_call_chain() should work for
> > DEVFREQ/devfreq_update from OPP.
> > 
> > >
> > >>       return r;
> > >>  }
> > >
> > > [...]
> > > --
> > > Regards,
> > > Nishanth Menon
> > >
> > 
> > 
> > Thank you for your comments, Nishanth.
> > 
> > 
> > Cheers!
> > - MyungJoo
> > 
> > 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 
> 


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 1/3] PM: Introduce DEVFREQ: generic DVFS framework with device-specific OPPs
  2011-06-16 21:11     ` Rafael J. Wysocki
@ 2011-07-02 21:30       ` Rafael J. Wysocki
  2011-07-02 21:30       ` Rafael J. Wysocki
  1 sibling, 0 replies; 35+ messages in thread
From: Rafael J. Wysocki @ 2011-07-02 21:30 UTC (permalink / raw)
  To: myungjoo.ham
  Cc: Len Brown, Greg Kroah-Hartman, linux-kernel, Kyungmin Park,
	linux-pm, Thomas Gleixner

On Thursday, June 16, 2011, Rafael J. Wysocki wrote:
> Nishanth,
> 
> Have your comments been addressed by the message below?

>From the silence I gather they have.

Thanks,
Rafael


 
> On Monday, May 30, 2011, MyungJoo Ham wrote:
> > On Sat, May 28, 2011 at 5:57 PM, Nishanth Menon <nm@ti.com> wrote:
> > > On 13:53-20110527, MyungJoo Ham wrote:
> > > [..]
> > >> diff --git a/drivers/base/power/opp.c b/drivers/base/power/opp.c
> > >> index 56a6899..819c1b3 100644
> > >> --- a/drivers/base/power/opp.c
> > >> +++ b/drivers/base/power/opp.c
> > >> @@ -21,6 +21,7 @@
> > >>  #include <linux/rculist.h>
> > >>  #include <linux/rcupdate.h>
> > >>  #include <linux/opp.h>
> > >> +#include <linux/devfreq.h>
> > >>
> > >>  /*
> > >>   * Internal data structure organization with the OPP layer library is as
> > >> @@ -428,6 +429,11 @@ int opp_add(struct device *dev, unsigned long freq, unsigned long u_volt)
> > >>       list_add_rcu(&new_opp->node, head);
> > >>       mutex_unlock(&dev_opp_list_lock);
> > >>
> > >> +     /*
> > >> +      * Notify generic dvfs for the change and ignore error
> > >> +      * because the device may not have a devfreq entry
> > >> +      */
> > >> +     devfreq_update(dev);
> > > I think I understand your thought about notifiers here - we had some
> > > initial thought that we may need notifiers, for e.g. clk change
> > > notifiers which could implement enable/disable on a dependent device,
> > > thermal management etc.. so far, we have'nt had a need to modify the
> > > lowest data store layer to handle this. In this case, I think you'd
> > > like a notifier mechanism in general for opp_add,enable or disable in
> > > the opp library.
> > 
> > Yes, I wanted a notifier machnaism called by OPP for any potential
> > changes in available frequencies. For that purpose, I've added
> > devfreq_update() because, for now, DEVFREQ is the only one that needs
> > such a notifier and a notifier has some overhead. If there are
> > multiple entities that require notifications from OPP for such
> > changes, we'd need notifiers maintained by OPP.
> > 
> > > However, with this change, an opp_add now means:
> > > a) opp_add now depends on devfreq_update() to be part of it's integral
> > > operation - I think the devfreq should maintain it's own notifier
> > > mechanisms and not make OPP library do the notification trigger job of
> > > devfreq.
> > 
> > To let the entities depending on the list of available frequencies be
> > notified, I think it would be appropriate for OPP to have a notifier
> > chain and call the notifier chain whenever there is a change in the
> > list. In that way, we can always add and remove the depending entities
> > (including DEVFREQ) to and from OPP's. Otherwise, we'd need OPP users
> > that call OPP functions to call the notifier chain when it appears
> > that the call will change the list.
> > 
> > As long as the updates in available frequencies affect other entities,
> > OPP library should be able to notify the others whenever such changes
> > occur. In this patch, I've used devfreq_update() directly because we
> > do not have other entities, yet. However, as you've  mentioned, there
> > will be others, so I'm willing to implement notifier in OPP if other
> > people also agree.
> > 
> > > b) By ignoring the error, how will the caller of opp_add now know if the
> > > failures were real - e.g. some device failed to act on the notification
> > > and the overall opp_add operation failed? And why should OPP library do
> > > recovery for on behalf of devfreq?
> > 
> > Um... OPP does not need to do any recovery for errors from DEVFREQ.
> > Other than errors from find_device_devfreq() in devfreq_update(), the
> > errors seem to be better returned somehow to the caller of opp_add().
> > However, would it be fine to let opp_add (and other
> > frequency-availability changing functions) return errors returned from
> > a notifier? (either devfreq_update() or srcu_notifier_call_chain())
> > 
> > > c) How will you ensure the lock accuracy - given that you need to keep
> > > to the constraints of opp_add/opp_set_availability here? For example:
> > > devfreq_do is called from devfreq_monitor(workqueue) and devfreq_update
> > > in addition to having to call a bunch of function pointers whose
> > > implementation you do not know, having a function that ensures security
> > > of it's data, handles all lock conditions that is imposed differently by
> > > add,enable and disable is not visible in this implementation.
> > 
> > devfreq_update() waits for devfreq_monitor() with devfreq_list_lock.
> > Both has devfreq_list_lock locked during their operations.
> > 
> > > c) How about considering the usage of OPP library with an SoC cpufreq
> > > implementation say for MPU AND using the same SoC using devfreq for
> > > controlling some devices like MALI in your example simultaneously? Lets
> > > say the thermal policy manager implementation for some reason did an
> > > opp_disable of a higher OPP of MPU - devfreq_update gets called with the
> > > device(mpu_dev).. devfreq_update needs to now invoke it's validation
> > > path and fails correctly - Vs devfreq_update being called by some module
> > > which actually uses devfreq and the update failed - How does one
> > > differentiate between the two?
> > 
> > When OPP is being used by CPUFREQ w/ CPU and by MALI, the OPP list
> > should be attached to the two devices independently.
> > In other words, the frequency of CPU should be controlled
> > independently from the frequency of OPP. If both CPU and MALI should
> > use the same frequency, there is no point to implement devfreq for
> > MALI. However, if the condition is not fully-dependent; e.g.,
> > Freq(MALI) <= Freq(CPU), we can implement two instances of OPP for
> > each of CPU and MALI and let the "target" function supplied to CPU to
> > run opp_enable/disable and devfreq_update() to MALI before adjusting
> > the frequency of itself.
> > 
> > When we are implementing another entity that controls OPP/DEVFREQ such
> > as the thermal policy manager, it should tackle both CPU and MALI w/
> > OPPs at the same time if they are fully independent (i.e., the
> > frequency of MALI is not affected by the frequency of CPU). If not,
> > the same things (discussed in the previous paragraph) are applied.
> > 
> > In the two examples you've mentioned, the first one is about the
> > implementation of the current patch (let thermal manager handle OPP
> > and OPP handle devfreq_update) and the another is when DEVFREQ calls
> > are implemented through thermal manager (let thermal manager handle
> > both OPP and devfreq_update), right?
> > 
> > For the two cases, I think that the first approach is more
> > appropriate. We will probably have more than just a thermal manager to
> > handle OPP (and devfreq that depends on OPP). The first approach
> > allows users to declare which frequencies are available and to let the
> > underlying structure do their work without intervention from users
> > (those who decide which frequencies are available). With the second
> > approach, we need to enforce every frequency affecting entity to
> > understand underlying frequency dependencies and the behavior of
> > devfreq, not jst to understand the behavior of the device according to
> > the frequency. DEVFREQ is going to be handled and implemented by the
> > device driver of the device controlled by the instance of DEVFREQ. By
> > using the first approach, we can prevent affecting entities other than
> > the device driver using DEVFREQ from thinking about DEVFREQ of that
> > device.
> > 
> > 
> > >
> > > just my 2 cents, I think these kind of notifiers should probably
> > > stay out of opp library OR notifiers be introduced in a generic and safe
> > > fashion.
> > 
> > So, you want either
> > 
> > to call devfreq_update from where opp_add/enable/disable/... are called.
> > 
> > or
> > 
> > to use notifier in OPP?
> > 
> > 
> > Well, the first one would become the source of messy problems as
> > mentioned earlier. The second one has some overhead, but it's a
> > general approach and does not incur the issues of the first approach.
> > 
> > Therefore, I prefer either to keep the initial approach regarding
> > devfreq_update() or to use a generic notifier at OPP (probably, SRCU
> > notifier chains?).
> > 
> > How about using a SRCU_NOTIFIER at OPP?
> > 
> > The notifier.h says that SRCU has low overhead generally and the most
> > overhead of SRCU comes from "remove srcu" (probably when "device" is
> > being removed), which won't happen a lot with OPP.
> > 
> > >
> > >>       return 0;
> > >>  }
> > >>
> > >> @@ -512,6 +518,9 @@ unlock:
> > >>       mutex_unlock(&dev_opp_list_lock);
> > >>  out:
> > >>       kfree(new_opp);
> > >> +
> > >> +     /* Notify generic dvfs for the change and ignore error */
> > >> +     devfreq_update(dev);
> > > Another place where I am confused - how does devfreq_update know in what
> > > context is it being called is the OPP getting enabled/disabled/added -
> > > devfreq_update has to do some additional work to make it's own decision
> > > - it is almost as if it uses devfreq_update as a trigger mechanism to reevalute
> > > the entire decision tree - without using information of the context
> > > opp_enable/disable/add has to maybe make better call. if this
> > > information is not needed, maybe some other mechanism implemented by
> > > devfreq layer might suffice..
> > 
> > No, DEVFREQ does not require the context (OPP being
> > enabled/disabled/...) with the devfreq_update() calls. It is
> > essentially a "notify devfreq to reevaluate" and DEVFREQ reevaluates
> > based on the available frequencies.
> > 
> > Thus, a plain srcu_notifier_call_chain() should work for
> > DEVFREQ/devfreq_update from OPP.
> > 
> > >
> > >>       return r;
> > >>  }
> > >
> > > [...]
> > > --
> > > Regards,
> > > Nishanth Menon
> > >
> > 
> > 
> > Thank you for your comments, Nishanth.
> > 
> > 
> > Cheers!
> > - MyungJoo
> > 
> > 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 1/3] PM: Introduce DEVFREQ: generic DVFS framework with device-specific OPPs
  2011-05-27  4:53 [PATCH v3 1/3] PM: Introduce DEVFREQ: generic DVFS framework with device-specific OPPs MyungJoo Ham
                   ` (6 preceding siblings ...)
  2011-07-02 21:56 ` Rafael J. Wysocki
@ 2011-07-02 21:56 ` Rafael J. Wysocki
  2011-07-04  8:34   ` MyungJoo Ham
  2011-07-04  8:34   ` MyungJoo Ham
  7 siblings, 2 replies; 35+ messages in thread
From: Rafael J. Wysocki @ 2011-07-02 21:56 UTC (permalink / raw)
  To: MyungJoo Ham
  Cc: linux-pm, linux-kernel, Len Brown, Pavel Machek,
	Greg Kroah-Hartman, Kyungmin Park, Jiejing Zhang, Colin Cross,
	Nishanth Menon, Thomas Gleixner, myungjoo.ham

Hi,

I'm not sure if spelling DEVFREQ in the kerneldoc comments with capitals is
a good idea, it looks kind of odd.  I'm not sure what would be look better,
though.

On Friday, May 27, 2011, MyungJoo Ham wrote:

...
> +/**
> + * devfreq_monitor() - Regularly run devfreq_do() and support device DEVFREQ tickle.

I'd say "periodically" istead of "regularly".

> + * @work: the work struct used to run devfreq_monitor periodically.

Also please say something more about the "tickle" thinkg here.

> + */
> +static void devfreq_monitor(struct work_struct *work)
> +{
> +	struct devfreq *devfreq;
> +	int error;
> +	bool continue_polling = false;
> +	struct devfreq *to_remove = NULL;
> +
> +	mutex_lock(&devfreq_list_lock);
> +
> +	list_for_each_entry(devfreq, &devfreq_list, node) {
> +		/* Remove the devfreq entry that failed */
> +		if (to_remove) {
> +			list_del(&to_remove->node);
> +			kfree(to_remove);
> +			to_remove = NULL;
> +		}
> +
> +		/*
> +		 * If the device is tickled and the tickle duration is left,
> +		 * do not change the frequency for a while
> +		 */
> +		if (devfreq->tickle) {
> +			continue_polling = true;
> +			devfreq->tickle--;
> +
> +			/*
> +			 * If the tickle is ending and the device is not going
> +			 * to poll, force the device to poll next time so that
> +			 * it can return to the original frequency afterwards.
> +			 * However, non-polling device will have 0 polling_ms,
> +			 * it will not poll again later.
> +			 */
> +			if (devfreq->tickle == 0 && devfreq->next_polling == 0)
> +				devfreq->next_polling = 1;
> +
> +			continue;
> +		}
> +
> +		/* This device does not require polling */
> +		if (devfreq->next_polling == 0)
> +			continue;
> +
> +		continue_polling = true;
> +
> +		if (devfreq->next_polling == 1) {
> +			/* This device is polling this time */

I'd remove this comment, it's confusing IMO.  Besides, it may be better
to structure the code like this:

if (devfreq->next_polling-- == 1) {
}

and then you wouldn't need the "else" at all.

> +			error = devfreq_do(devfreq);
> +			if (error && error != -EAGAIN) {
> +				/*
> +				 * Remove a devfreq with error. However,
> +				 * We cannot remove it right here because the

Comma after "here", please.

> +				 * devfreq pointer is going to be used by
> +				 * list_for_each_entry above. Thus, it is
> +				 * removed afterwards.

Why don't you use list_for_each_entry_safe(), then?

> +				 */
> +				to_remove = devfreq->dev;
> +				dev_err(devfreq->dev, "devfreq_do error(%d). "
> +					"DEVFREQ is removed from the device\n",
> +					error);
> +				continue;
> +			}
> +			devfreq->next_polling = DIV_ROUND_UP(
> +						devfreq->profile->polling_ms,
> +						DEVFREQ_INTERVAL);
> +		} else {
> +			/* The device will poll later when next_polling = 1 */
> +			devfreq->next_polling--;
> +		}
> +	}
> +
> +	if (to_remove) {
> +		list_del(&to_remove->node);
> +		kfree(to_remove);
> +		to_remove = NULL;
> +	}
> +
> +	if (continue_polling) {
> +		polling = true;
> +		queue_delayed_work(devfreq_wq, &devfreq_work,
> +				   msecs_to_jiffies(DEVFREQ_INTERVAL));
> +	} else {
> +		polling = false;
> +	}

OK, so why exactly continue_polling is needed?  It seems you might siply use
"polling" instead of it above.

> +
> +	mutex_unlock(&devfreq_list_lock);
> +}
...
> +/**
> + * devfreq_update() - Notify that the device OPP has been changed.
> + * @dev:	the device whose OPP has been changed.
> + * @may_not_exist:	do not print error message even if the device
> + *			does not have devfreq entry.

This argument isn't used any more.

> + */
> +int devfreq_update(struct device *dev)
> +{
> +	struct devfreq *devfreq;
> +	int err = 0;
> +
> +	mutex_lock(&devfreq_list_lock);
> +
> +	devfreq = find_device_devfreq(dev);
> +	if (IS_ERR(devfreq)) {
> +		err = PTR_ERR(devfreq);
> +		goto out;
> +	}
> +
> +	if (devfreq->tickle) {
> +		/* If the max freq available is changed, re-tickle */

It would be good to explain what "re-tickle" means.

> +		unsigned long freq = devfreq->profile->max_freq;
> +		struct opp *opp = opp_find_freq_floor(devfreq->dev, &freq);
> +
> +		if (IS_ERR(opp)) {
> +			err = PTR_ERR(opp);
> +			goto out;
> +		}
> +
> +		/* Max freq available is not changed */
> +		if (devfreq->previous_freq == freq)
> +			goto out;
> +
> +		err = devfreq->profile->target(devfreq->dev, opp);
> +		if (!err)
> +			devfreq->previous_freq = freq;

Why don't we run devfreq_do() in this case?

> +	} else {
> +		/* Reevaluate the proper frequency */
> +		err = devfreq_do(devfreq);
> +	}
> +
> +out:
> +	mutex_unlock(&devfreq_list_lock);
> +	return err;
> +}
> +

Add a kerneldoc comment here, please.

> +static int _devfreq_tickle_device(struct devfreq *df, unsigned long delay)
> +{
> +	int err = 0;
> +	unsigned long freq;
> +	struct opp *opp;
> +
> +	freq = df->profile->max_freq;
> +	opp = opp_find_freq_floor(df->dev, &freq);
> +	if (IS_ERR(opp))
> +		return PTR_ERR(opp);
> +
> +	if (df->previous_freq != freq) {
> +		err = df->profile->target(df->dev, opp);
> +		if (!err)
> +			df->previous_freq = freq;
> +	}
> +	if (err) {
> +		dev_err(df->dev, "%s: Cannot set frequency.\n", __func__);
> +	} else {
> +		df->tickle = delay;
> +		df->num_tickle++;
> +	}
> +
> +	if (devfreq_wq && !polling) {
> +		polling = true;
> +		queue_delayed_work(devfreq_wq, &devfreq_work,
> +				msecs_to_jiffies(DEVFREQ_INTERVAL));
> +	}
> +
> +	return err;
> +}

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 1/3] PM: Introduce DEVFREQ: generic DVFS framework with device-specific OPPs
  2011-05-27  4:53 [PATCH v3 1/3] PM: Introduce DEVFREQ: generic DVFS framework with device-specific OPPs MyungJoo Ham
                   ` (5 preceding siblings ...)
  2011-05-28  8:57 ` Nishanth Menon
@ 2011-07-02 21:56 ` Rafael J. Wysocki
  2011-07-02 21:56 ` Rafael J. Wysocki
  7 siblings, 0 replies; 35+ messages in thread
From: Rafael J. Wysocki @ 2011-07-02 21:56 UTC (permalink / raw)
  To: MyungJoo Ham
  Cc: Len Brown, Greg Kroah-Hartman, linux-kernel, Kyungmin Park,
	linux-pm, Thomas Gleixner

Hi,

I'm not sure if spelling DEVFREQ in the kerneldoc comments with capitals is
a good idea, it looks kind of odd.  I'm not sure what would be look better,
though.

On Friday, May 27, 2011, MyungJoo Ham wrote:

...
> +/**
> + * devfreq_monitor() - Regularly run devfreq_do() and support device DEVFREQ tickle.

I'd say "periodically" istead of "regularly".

> + * @work: the work struct used to run devfreq_monitor periodically.

Also please say something more about the "tickle" thinkg here.

> + */
> +static void devfreq_monitor(struct work_struct *work)
> +{
> +	struct devfreq *devfreq;
> +	int error;
> +	bool continue_polling = false;
> +	struct devfreq *to_remove = NULL;
> +
> +	mutex_lock(&devfreq_list_lock);
> +
> +	list_for_each_entry(devfreq, &devfreq_list, node) {
> +		/* Remove the devfreq entry that failed */
> +		if (to_remove) {
> +			list_del(&to_remove->node);
> +			kfree(to_remove);
> +			to_remove = NULL;
> +		}
> +
> +		/*
> +		 * If the device is tickled and the tickle duration is left,
> +		 * do not change the frequency for a while
> +		 */
> +		if (devfreq->tickle) {
> +			continue_polling = true;
> +			devfreq->tickle--;
> +
> +			/*
> +			 * If the tickle is ending and the device is not going
> +			 * to poll, force the device to poll next time so that
> +			 * it can return to the original frequency afterwards.
> +			 * However, non-polling device will have 0 polling_ms,
> +			 * it will not poll again later.
> +			 */
> +			if (devfreq->tickle == 0 && devfreq->next_polling == 0)
> +				devfreq->next_polling = 1;
> +
> +			continue;
> +		}
> +
> +		/* This device does not require polling */
> +		if (devfreq->next_polling == 0)
> +			continue;
> +
> +		continue_polling = true;
> +
> +		if (devfreq->next_polling == 1) {
> +			/* This device is polling this time */

I'd remove this comment, it's confusing IMO.  Besides, it may be better
to structure the code like this:

if (devfreq->next_polling-- == 1) {
}

and then you wouldn't need the "else" at all.

> +			error = devfreq_do(devfreq);
> +			if (error && error != -EAGAIN) {
> +				/*
> +				 * Remove a devfreq with error. However,
> +				 * We cannot remove it right here because the

Comma after "here", please.

> +				 * devfreq pointer is going to be used by
> +				 * list_for_each_entry above. Thus, it is
> +				 * removed afterwards.

Why don't you use list_for_each_entry_safe(), then?

> +				 */
> +				to_remove = devfreq->dev;
> +				dev_err(devfreq->dev, "devfreq_do error(%d). "
> +					"DEVFREQ is removed from the device\n",
> +					error);
> +				continue;
> +			}
> +			devfreq->next_polling = DIV_ROUND_UP(
> +						devfreq->profile->polling_ms,
> +						DEVFREQ_INTERVAL);
> +		} else {
> +			/* The device will poll later when next_polling = 1 */
> +			devfreq->next_polling--;
> +		}
> +	}
> +
> +	if (to_remove) {
> +		list_del(&to_remove->node);
> +		kfree(to_remove);
> +		to_remove = NULL;
> +	}
> +
> +	if (continue_polling) {
> +		polling = true;
> +		queue_delayed_work(devfreq_wq, &devfreq_work,
> +				   msecs_to_jiffies(DEVFREQ_INTERVAL));
> +	} else {
> +		polling = false;
> +	}

OK, so why exactly continue_polling is needed?  It seems you might siply use
"polling" instead of it above.

> +
> +	mutex_unlock(&devfreq_list_lock);
> +}
...
> +/**
> + * devfreq_update() - Notify that the device OPP has been changed.
> + * @dev:	the device whose OPP has been changed.
> + * @may_not_exist:	do not print error message even if the device
> + *			does not have devfreq entry.

This argument isn't used any more.

> + */
> +int devfreq_update(struct device *dev)
> +{
> +	struct devfreq *devfreq;
> +	int err = 0;
> +
> +	mutex_lock(&devfreq_list_lock);
> +
> +	devfreq = find_device_devfreq(dev);
> +	if (IS_ERR(devfreq)) {
> +		err = PTR_ERR(devfreq);
> +		goto out;
> +	}
> +
> +	if (devfreq->tickle) {
> +		/* If the max freq available is changed, re-tickle */

It would be good to explain what "re-tickle" means.

> +		unsigned long freq = devfreq->profile->max_freq;
> +		struct opp *opp = opp_find_freq_floor(devfreq->dev, &freq);
> +
> +		if (IS_ERR(opp)) {
> +			err = PTR_ERR(opp);
> +			goto out;
> +		}
> +
> +		/* Max freq available is not changed */
> +		if (devfreq->previous_freq == freq)
> +			goto out;
> +
> +		err = devfreq->profile->target(devfreq->dev, opp);
> +		if (!err)
> +			devfreq->previous_freq = freq;

Why don't we run devfreq_do() in this case?

> +	} else {
> +		/* Reevaluate the proper frequency */
> +		err = devfreq_do(devfreq);
> +	}
> +
> +out:
> +	mutex_unlock(&devfreq_list_lock);
> +	return err;
> +}
> +

Add a kerneldoc comment here, please.

> +static int _devfreq_tickle_device(struct devfreq *df, unsigned long delay)
> +{
> +	int err = 0;
> +	unsigned long freq;
> +	struct opp *opp;
> +
> +	freq = df->profile->max_freq;
> +	opp = opp_find_freq_floor(df->dev, &freq);
> +	if (IS_ERR(opp))
> +		return PTR_ERR(opp);
> +
> +	if (df->previous_freq != freq) {
> +		err = df->profile->target(df->dev, opp);
> +		if (!err)
> +			df->previous_freq = freq;
> +	}
> +	if (err) {
> +		dev_err(df->dev, "%s: Cannot set frequency.\n", __func__);
> +	} else {
> +		df->tickle = delay;
> +		df->num_tickle++;
> +	}
> +
> +	if (devfreq_wq && !polling) {
> +		polling = true;
> +		queue_delayed_work(devfreq_wq, &devfreq_work,
> +				msecs_to_jiffies(DEVFREQ_INTERVAL));
> +	}
> +
> +	return err;
> +}

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 2/3] PM / DEVFREQ: add example governors
  2011-05-27  4:53 ` [PATCH v3 2/3] PM / DEVFREQ: add example governors MyungJoo Ham
@ 2011-07-02 21:58   ` Rafael J. Wysocki
  2011-07-04  1:37       ` MyungJoo Ham
  2011-07-02 21:58   ` Rafael J. Wysocki
  1 sibling, 1 reply; 35+ messages in thread
From: Rafael J. Wysocki @ 2011-07-02 21:58 UTC (permalink / raw)
  To: MyungJoo Ham
  Cc: linux-pm, linux-kernel, Len Brown, Pavel Machek,
	Greg Kroah-Hartman, Kyungmin Park, Jiejing Zhang, Colin Cross,
	Nishanth Menon, Thomas Gleixner, myungjoo.ham

Hi,

On Friday, May 27, 2011, MyungJoo Ham wrote:
> Three CPUFREQ-like governors are provided as examples.
> 
> powersave: use the lowest frequency possible. The user (device) should
> set the polling_ms as 0 because polling is useless for this governor.
> 
> performance: use the highest freqeuncy possible. The user (device)
> should set the polling_ms as 0 because polling is useless for this
> governor.
> 
> simple_ondemand: simplified version of CPUFREQ's ONDEMAND governor.
> 
> When a user updates OPP entries (enable/disable/add), OPP framework
> automatically notifies DEVFREQ to update operating frequency
> accordingly. Thus, DEVFREQ users (device drivers) do not need to update
> DEVFREQ manually with OPP entry updates or set polling_ms for powersave
> , performance, or any other "static" governors.
> 
> Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
> ---
>  drivers/base/power/devfreq.c |   69 ++++++++++++++++++++++++++++++++++++++++++
>  include/linux/devfreq.h      |    5 +++
>  2 files changed, 74 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/base/power/devfreq.c b/drivers/base/power/devfreq.c
> index 883d953..7648a94 100644
> --- a/drivers/base/power/devfreq.c
> +++ b/drivers/base/power/devfreq.c
> @@ -395,3 +395,72 @@ static int __init devfreq_init(void)
>  	return 0;
>  }
>  late_initcall(devfreq_init);
> +
> +static int devfreq_powersave_func(struct devfreq *df,
> +				  unsigned long *freq)
> +{
> +	*freq = 0; /* devfreq_do will run "ceiling" to 0 */
> +	return 0;
> +}
> +
> +struct devfreq_governor devfreq_powersave = {
> +	.get_target_freq = devfreq_powersave_func,
> +};
> +
> +static int devfreq_performance_func(struct devfreq *df,
> +				    unsigned long *freq)
> +{
> +	*freq = UINT_MAX; /* devfreq_do will run "floor" */
> +	return 0;
> +}
> +
> +struct devfreq_governor devfreq_performance = {
> +	.get_target_freq = devfreq_performance_func,
> +};
> +
> +/* Constants for DevFreq-Simple-Ondemand (DFSO) */
> +#define DFSO_UPTHRESHOLD	(90)
> +#define DFSO_DOWNDIFFERENCTIAL	(5)
> +static int devfreq_simple_ondemand_func(struct devfreq *df,
> +					unsigned long *freq)
> +{
> +	struct devfreq_dev_status stat;
> +	int err = df->profile->get_dev_status(df->dev, &stat);
> +	unsigned long long a, b;
> +
> +	if (err)
> +		return err;
> +
> +	/* Set MAX if it's busy enough */
> +	if (stat.busy_time * 100 >
> +	    stat.total_time * DFSO_UPTHRESHOLD) {
> +		*freq = UINT_MAX;
> +		return 0;
> +	}
> +
> +	/* Set MAX if we do not know the initial frequency */
> +	if (stat.current_frequency == 0) {
> +		*freq = UINT_MAX;
> +		return 0;
> +	}
> +
> +	/* Keep the current frequency */
> +	if (stat.busy_time * 100 >
> +	    stat.total_time * (DFSO_UPTHRESHOLD - DFSO_DOWNDIFFERENCTIAL)) {
> +		*freq = stat.current_frequency;
> +		return 0;
> +	}
> +
> +	/* Set the desired frequency based on the load */
> +	a = (unsigned long long) stat.busy_time * stat.current_frequency;

What's the purpose of the conversion?

> +	b = div_u64(a, stat.total_time);
> +	b *= 100;
> +	b = div_u64(b, (DFSO_UPTHRESHOLD - DFSO_DOWNDIFFERENCTIAL / 2));
> +	*freq = (unsigned long) b;
> +
> +	return 0;
> +}

Rafael

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 2/3] PM / DEVFREQ: add example governors
  2011-05-27  4:53 ` [PATCH v3 2/3] PM / DEVFREQ: add example governors MyungJoo Ham
  2011-07-02 21:58   ` Rafael J. Wysocki
@ 2011-07-02 21:58   ` Rafael J. Wysocki
  1 sibling, 0 replies; 35+ messages in thread
From: Rafael J. Wysocki @ 2011-07-02 21:58 UTC (permalink / raw)
  To: MyungJoo Ham
  Cc: Len Brown, Greg Kroah-Hartman, linux-kernel, Kyungmin Park,
	linux-pm, Thomas Gleixner

Hi,

On Friday, May 27, 2011, MyungJoo Ham wrote:
> Three CPUFREQ-like governors are provided as examples.
> 
> powersave: use the lowest frequency possible. The user (device) should
> set the polling_ms as 0 because polling is useless for this governor.
> 
> performance: use the highest freqeuncy possible. The user (device)
> should set the polling_ms as 0 because polling is useless for this
> governor.
> 
> simple_ondemand: simplified version of CPUFREQ's ONDEMAND governor.
> 
> When a user updates OPP entries (enable/disable/add), OPP framework
> automatically notifies DEVFREQ to update operating frequency
> accordingly. Thus, DEVFREQ users (device drivers) do not need to update
> DEVFREQ manually with OPP entry updates or set polling_ms for powersave
> , performance, or any other "static" governors.
> 
> Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
> ---
>  drivers/base/power/devfreq.c |   69 ++++++++++++++++++++++++++++++++++++++++++
>  include/linux/devfreq.h      |    5 +++
>  2 files changed, 74 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/base/power/devfreq.c b/drivers/base/power/devfreq.c
> index 883d953..7648a94 100644
> --- a/drivers/base/power/devfreq.c
> +++ b/drivers/base/power/devfreq.c
> @@ -395,3 +395,72 @@ static int __init devfreq_init(void)
>  	return 0;
>  }
>  late_initcall(devfreq_init);
> +
> +static int devfreq_powersave_func(struct devfreq *df,
> +				  unsigned long *freq)
> +{
> +	*freq = 0; /* devfreq_do will run "ceiling" to 0 */
> +	return 0;
> +}
> +
> +struct devfreq_governor devfreq_powersave = {
> +	.get_target_freq = devfreq_powersave_func,
> +};
> +
> +static int devfreq_performance_func(struct devfreq *df,
> +				    unsigned long *freq)
> +{
> +	*freq = UINT_MAX; /* devfreq_do will run "floor" */
> +	return 0;
> +}
> +
> +struct devfreq_governor devfreq_performance = {
> +	.get_target_freq = devfreq_performance_func,
> +};
> +
> +/* Constants for DevFreq-Simple-Ondemand (DFSO) */
> +#define DFSO_UPTHRESHOLD	(90)
> +#define DFSO_DOWNDIFFERENCTIAL	(5)
> +static int devfreq_simple_ondemand_func(struct devfreq *df,
> +					unsigned long *freq)
> +{
> +	struct devfreq_dev_status stat;
> +	int err = df->profile->get_dev_status(df->dev, &stat);
> +	unsigned long long a, b;
> +
> +	if (err)
> +		return err;
> +
> +	/* Set MAX if it's busy enough */
> +	if (stat.busy_time * 100 >
> +	    stat.total_time * DFSO_UPTHRESHOLD) {
> +		*freq = UINT_MAX;
> +		return 0;
> +	}
> +
> +	/* Set MAX if we do not know the initial frequency */
> +	if (stat.current_frequency == 0) {
> +		*freq = UINT_MAX;
> +		return 0;
> +	}
> +
> +	/* Keep the current frequency */
> +	if (stat.busy_time * 100 >
> +	    stat.total_time * (DFSO_UPTHRESHOLD - DFSO_DOWNDIFFERENCTIAL)) {
> +		*freq = stat.current_frequency;
> +		return 0;
> +	}
> +
> +	/* Set the desired frequency based on the load */
> +	a = (unsigned long long) stat.busy_time * stat.current_frequency;

What's the purpose of the conversion?

> +	b = div_u64(a, stat.total_time);
> +	b *= 100;
> +	b = div_u64(b, (DFSO_UPTHRESHOLD - DFSO_DOWNDIFFERENCTIAL / 2));
> +	*freq = (unsigned long) b;
> +
> +	return 0;
> +}

Rafael

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 3/3] PM / DEVFREQ: add sysfs interface (including user tickling)
  2011-05-27  4:53 ` [PATCH v3 3/3] PM / DEVFREQ: add sysfs interface (including user tickling) MyungJoo Ham
  2011-07-02 22:13   ` Rafael J. Wysocki
@ 2011-07-02 22:13   ` Rafael J. Wysocki
  2011-07-04  7:15     ` MyungJoo Ham
  2011-07-04  7:15     ` MyungJoo Ham
  1 sibling, 2 replies; 35+ messages in thread
From: Rafael J. Wysocki @ 2011-07-02 22:13 UTC (permalink / raw)
  To: MyungJoo Ham
  Cc: linux-pm, linux-kernel, Len Brown, Pavel Machek,
	Greg Kroah-Hartman, Kyungmin Park, Jiejing Zhang, Colin Cross,
	Nishanth Menon, Thomas Gleixner, myungjoo.ham

On Friday, May 27, 2011, MyungJoo Ham wrote:
> 1. System-wide sysfs interface
> - tickle_all	R: number of tickle_all execution
> 		W: tickle all devfreq devices
> - min_interval	R: devfreq monitoring base interval in ms
> - monitoring	R: shows whether devfreq monitoring is active or
>   not.
> 
> 2. Device specific sysfs interface
> - tickle	R: number of tickle execution for the device
> 		W: tickle the device
> 
> Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
> 
> --
> Changed from v2
> - add ABI entries for devfreq sysfs interface
> ---
>  Documentation/ABI/testing/sysfs-devices-devfreq |   21 ++++
>  Documentation/ABI/testing/sysfs-power           |   43 ++++++++
>  drivers/base/power/devfreq.c                    |  133 ++++++++++++++++++++++-
>  include/linux/devfreq.h                         |    3 +
>  4 files changed, 199 insertions(+), 1 deletions(-)
>  create mode 100644 Documentation/ABI/testing/sysfs-devices-devfreq
> 
> diff --git a/Documentation/ABI/testing/sysfs-devices-devfreq b/Documentation/ABI/testing/sysfs-devices-devfreq
> new file mode 100644
> index 0000000..7f35a64
> --- /dev/null
> +++ b/Documentation/ABI/testing/sysfs-devices-devfreq
> @@ -0,0 +1,21 @@
> +What:		/sys/devices/.../devfreq/
> +Date:		May 2011
> +Contact:	MyungJoo Ham <myungjoo.ham@samsung.com>
> +Description:
> +		The /sys/device/.../devfreq directory will contain files
> +		that provide interfaces to DEVFREQ for a specific device.
> +
> +What:		/sys/devices/.../devfreq/tickle
> +Date:		May 2011
> +Contact:	MyungJoo Ham <myungjoo.ham@samsung.com>
> +Description:
> +		The /sys/devices/.../devfreq/tickle file allows user space
> +		to force the corresponding device to operate at its maximum
> +		operable frequency instaneously and temporarily. After a
> +		designated duration has passed, the operating frequency returns
> +		to normal. When a user reads the tickle entry, it returns
> +		the number of tickle executions for the device. When a user
> +		writes to the tickle entry with the tickle duration in ms,
> +		the effect of device tickling is held for the designated
> +		duration. Note that the duration is rounded-up by
> +		the value DEVFREQ_INTERVAL defined in devfreq.c
> diff --git a/Documentation/ABI/testing/sysfs-power b/Documentation/ABI/testing/sysfs-power
> index b464d12..4d8434b 100644
> --- a/Documentation/ABI/testing/sysfs-power
> +++ b/Documentation/ABI/testing/sysfs-power
> @@ -172,3 +172,46 @@ Description:
>  
>  		Reading from this file will display the current value, which is
>  		set to 1 MB by default.
> +
> +What:		/sys/power/devfreq/
> +Date:		May 2011
> +Contact:	MyungJoo Ham <myungjoo.ham@samsung.com>
> +Description:
> +		The /sys/power/devfreq directory will contain files that will
> +		provide a unified interface to the DEVFREQ, a generic DVFS
> +		(dynamic voltage and frequency scaling) framework.
> +
> +What:		/sys/power/devfreq/tickle_all
> +Date:		May 2011
> +Contact:	MyungJoo Ham <myungjoo.ham@samsung.com>
> +Description:
> +		The /sys/power/devfreq/tickle_all file allows user space to
> +		force every device with DEVFREQ to operate at the maximum
> +		frequency of the device instaneously and temporarily. After
> +		a designated delay has passed, the operating frequency returns
> +		to normal. If a user reads the tickle_all entry, it returns
> +		the number of tickle_all executions. When writing to the
> +		tickle_all entry, the user should supply with the duration of
> +		tickle in ms (the "designated delay" mentioned before). Then,
> +		the effect of tickle_all will hold for the denoted duration.
> +		Note that the duration is rounded by the monitoring period
> +		defined by DEVFREQ_INTERVAL in /drivers/base/power/devfreq.c.
> +
> +What:		/sys/power/devfreq/min_interval
> +Date:		May 2011
> +Contact:	MyungJoo Ham <myungjoo.ham@samsung.com>
> +Description:
> +		The /sys/power/devfreq/min_interval file shows the monitoring
> +		period defined by DEVFREQ_INTERVAL in
> +		/drivers/base/power/devfreq.c. The duration of device tickling
> +		is rounded-up by DEVFREQ_INTERVAL.
> +
> +What:		/sys/power/devfreq/monitoring
> +Date:		May 2011
> +Contact:	MyungJoo Ham <myungjoo.ham@samsung.com>
> +Description:
> +		The /sys/power/devfreq/monitoring file shows whether DEVFREQ
> +		is periodically monitoring. Periodic monitoring is activated
> +		if there is a device that wants periodic monitoring for DVFS or
> +		there is a device that is tickled (and the tickling duration is
> +		not yet expired).
> diff --git a/drivers/base/power/devfreq.c b/drivers/base/power/devfreq.c
> index 7648a94..709c138 100644
> --- a/drivers/base/power/devfreq.c
> +++ b/drivers/base/power/devfreq.c
> @@ -40,6 +40,9 @@ static LIST_HEAD(devfreq_list);
>  /* Exclusive access to devfreq_list and its elements */
>  static DEFINE_MUTEX(devfreq_list_lock);
>  
> +static struct kobject *devfreq_kobj;
> +static struct attribute_group dev_attr_group;
> +
>  /**
>   * find_device_devfreq() - find devfreq struct using device pointer
>   * @dev:	device pointer used to lookup device DEVFREQ.
> @@ -237,6 +240,8 @@ int devfreq_add_device(struct device *dev, struct devfreq_dev_profile *profile,
>  		queue_delayed_work(devfreq_wq, &devfreq_work,
>  				   msecs_to_jiffies(DEVFREQ_INTERVAL));
>  	}
> +
> +	sysfs_update_group(&dev->kobj, &dev_attr_group);

This appears to modify the attributes of the device, but anything like it is
not mentioned in the documentation.  What's up?

>  out:
>  	mutex_unlock(&devfreq_list_lock);
>  
> @@ -263,6 +268,8 @@ int devfreq_remove_device(struct device *dev)
>  		return -EINVAL;
>  	}
>  
> +	sysfs_remove_group(&dev->kobj, &dev_attr_group);
> +
>  	list_del(&devfreq->node);
>  
>  	kfree(devfreq);
> @@ -344,7 +351,7 @@ static int _devfreq_tickle_device(struct devfreq *df, unsigned long delay)
>  	if (devfreq_wq && !polling) {
>  		polling = true;
>  		queue_delayed_work(devfreq_wq, &devfreq_work,
> -				msecs_to_jiffies(DEVFREQ_INTERVAL));
> +				   msecs_to_jiffies(DEVFREQ_INTERVAL));

This change doesn't seem to belong to this patch.

>  	}
>  
>  	return err;
> @@ -382,6 +389,116 @@ int devfreq_tickle_device(struct device *dev, unsigned long duration_ms)
>  	return err;
>  }
>  
> +static int num_tickle_all;
> +static ssize_t tickle_all(struct device *dev, struct device_attribute *attr,
> +			  const char *buf, size_t count)
> +{
> +	int duration = 0;
> +	struct devfreq *tmp;
> +	unsigned long delay;
> +
> +	sscanf(buf, "%d", &duration);
> +	if (duration < DEVFREQ_INTERVAL)
> +		duration = DEVFREQ_INTERVAL;
> +
> +	if (unlikely(IS_ERR_OR_NULL(dev))) {
> +		pr_err("%s: Invalid parameters\n", __func__);

Please say "null device" instead of in addition to the above.

> +		return -EINVAL;
> +	}
> +
> +	delay = DIV_ROUND_UP(duration, DEVFREQ_INTERVAL);
> +
> +	mutex_lock(&devfreq_list_lock);
> +	list_for_each_entry(tmp, &devfreq_list, node) {
> +		_devfreq_tickle_device(tmp, delay);
> +	}
> +	mutex_unlock(&devfreq_list_lock);
> +
> +	num_tickle_all++;
> +	return count;
> +}
> +
> +static ssize_t show_num_tickle_all(struct device *dev,
> +				   struct device_attribute *attr, char *buf)
> +{
> +	return sprintf(buf, "%d\n", num_tickle_all);
> +}
> +
> +static ssize_t show_min_interval(struct device *dev,
> +				 struct device_attribute *attr, char *buf)
> +{
> +	return sprintf(buf, "%d\n", DEVFREQ_INTERVAL);
> +}
> +
> +static ssize_t show_monitoring(struct device *dev,
> +			       struct device_attribute *attr, char *buf)
> +{
> +	return sprintf(buf, "%d\n", monitoring ? 1 : 0);
> +}
> +
> +static DEVICE_ATTR(tickle_all, 0644, show_num_tickle_all, tickle_all);
> +static DEVICE_ATTR(min_interval, 0444, show_min_interval, NULL);
> +static DEVICE_ATTR(monitoring, 0444, show_monitoring, NULL);
> +static struct attribute *devfreq_entries[] = {
> +	&dev_attr_tickle_all.attr,
> +	&dev_attr_min_interval.attr,
> +	&dev_attr_monitoring.attr,
> +	NULL,
> +};
> +static struct attribute_group devfreq_attr_group = {
> +	.name	= NULL,
> +	.attrs	= devfreq_entries,
> +};
> +
> +static ssize_t tickle(struct device *dev, struct device_attribute *attr,
> +		      const char *buf, size_t count)
> +{
> +	int duration;
> +	struct devfreq *df;
> +	unsigned long delay;
> +
> +	sscanf(buf, "%d", &duration);
> +	if (duration < DEVFREQ_INTERVAL)
> +		duration = DEVFREQ_INTERVAL;
> +
> +	if (unlikely(IS_ERR_OR_NULL(dev))) {
> +		pr_err("%s: Invalid parameters\n", __func__);

Like above.

> +		return -EINVAL;
> +	}
> +
> +	delay = DIV_ROUND_UP(duration, DEVFREQ_INTERVAL);
> +
> +	mutex_lock(&devfreq_list_lock);
> +	df = find_device_devfreq(dev);
> +	_devfreq_tickle_device(df, delay);
> +	mutex_unlock(&devfreq_list_lock);
> +
> +	return count;
> +}
> +
> +static ssize_t show_num_tickle(struct device *dev,
> +			       struct device_attribute *attr, char *buf)
> +{
> +	struct devfreq *df;
> +
> +	df = find_device_devfreq(dev);
> +
> +	if (!IS_ERR(df))
> +		return sprintf(buf, "%d\n", df->num_tickle);
> +
> +	return PTR_ERR(df);
> +}
> +
> +static DEVICE_ATTR(tickle, 0644, show_num_tickle, tickle);
> +static struct attribute *dev_entries[] = {
> +	&dev_attr_tickle.attr,
> +	NULL,
> +};
> +static struct attribute_group dev_attr_group = {
> +	.name	= "devfreq",
> +	.attrs	= dev_entries,
> +};
> +
>  static int __init devfreq_init(void)
>  {
>  	mutex_lock(&devfreq_list_lock);
> @@ -389,6 +506,20 @@ static int __init devfreq_init(void)
>  	polling = false;
>  	devfreq_wq = create_freezable_workqueue("devfreq_wq");
>  	INIT_DELAYED_WORK_DEFERRABLE(&devfreq_work, devfreq_monitor);
> +
> +#ifdef CONFIG_PM
> +	/* Create sysfs */
> +	devfreq_kobj = kobject_create_and_add("devfreq", power_kobj);

Hmm, so power_kobj is global?  It shouldn't be.

Generally, whatever adds attributes to /sys/power should be located in
kernel/power/ .

> +	if (!devfreq_kobj) {
> +		pr_err("Unable to create DEVFREQ kobject.\n");
> +		goto out;
> +	}
> +	if (sysfs_create_group(devfreq_kobj, &devfreq_attr_group)) {
> +		pr_err("Unable to create DEVFREQ sysfs entries.\n");
> +		goto out;
> +	}
> +#endif
> +out:
>  	mutex_unlock(&devfreq_list_lock);
>  
>  	devfreq_monitor(&devfreq_work.work);
> diff --git a/include/linux/devfreq.h b/include/linux/devfreq.h
> index 1ec9a40..69334e7 100644
> --- a/include/linux/devfreq.h
> +++ b/include/linux/devfreq.h
> @@ -59,6 +59,7 @@ struct devfreq_governor {
>   *		at each executino of devfreq_monitor, tickle is decremented.
>   *		User may tickle a device-devfreq in order to set maximum
>   *		frequency instaneously with some guaranteed duration.
> + * @num_tickle	number of tickle calls.
>   *
>   * This structure stores the DEVFREQ information for a give device.
>   */
> @@ -72,6 +73,8 @@ struct devfreq {
>  	unsigned long previous_freq;
>  	unsigned int next_polling;
>  	unsigned int tickle;
> +
> +	unsigned int num_tickle;
>  };
>  
>  #if defined(CONFIG_PM_DEVFREQ)

It looks like the above two changes should be moved to a separate patch.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 3/3] PM / DEVFREQ: add sysfs interface (including user tickling)
  2011-05-27  4:53 ` [PATCH v3 3/3] PM / DEVFREQ: add sysfs interface (including user tickling) MyungJoo Ham
@ 2011-07-02 22:13   ` Rafael J. Wysocki
  2011-07-02 22:13   ` Rafael J. Wysocki
  1 sibling, 0 replies; 35+ messages in thread
From: Rafael J. Wysocki @ 2011-07-02 22:13 UTC (permalink / raw)
  To: MyungJoo Ham
  Cc: Len Brown, Greg Kroah-Hartman, linux-kernel, Kyungmin Park,
	linux-pm, Thomas Gleixner

On Friday, May 27, 2011, MyungJoo Ham wrote:
> 1. System-wide sysfs interface
> - tickle_all	R: number of tickle_all execution
> 		W: tickle all devfreq devices
> - min_interval	R: devfreq monitoring base interval in ms
> - monitoring	R: shows whether devfreq monitoring is active or
>   not.
> 
> 2. Device specific sysfs interface
> - tickle	R: number of tickle execution for the device
> 		W: tickle the device
> 
> Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
> 
> --
> Changed from v2
> - add ABI entries for devfreq sysfs interface
> ---
>  Documentation/ABI/testing/sysfs-devices-devfreq |   21 ++++
>  Documentation/ABI/testing/sysfs-power           |   43 ++++++++
>  drivers/base/power/devfreq.c                    |  133 ++++++++++++++++++++++-
>  include/linux/devfreq.h                         |    3 +
>  4 files changed, 199 insertions(+), 1 deletions(-)
>  create mode 100644 Documentation/ABI/testing/sysfs-devices-devfreq
> 
> diff --git a/Documentation/ABI/testing/sysfs-devices-devfreq b/Documentation/ABI/testing/sysfs-devices-devfreq
> new file mode 100644
> index 0000000..7f35a64
> --- /dev/null
> +++ b/Documentation/ABI/testing/sysfs-devices-devfreq
> @@ -0,0 +1,21 @@
> +What:		/sys/devices/.../devfreq/
> +Date:		May 2011
> +Contact:	MyungJoo Ham <myungjoo.ham@samsung.com>
> +Description:
> +		The /sys/device/.../devfreq directory will contain files
> +		that provide interfaces to DEVFREQ for a specific device.
> +
> +What:		/sys/devices/.../devfreq/tickle
> +Date:		May 2011
> +Contact:	MyungJoo Ham <myungjoo.ham@samsung.com>
> +Description:
> +		The /sys/devices/.../devfreq/tickle file allows user space
> +		to force the corresponding device to operate at its maximum
> +		operable frequency instaneously and temporarily. After a
> +		designated duration has passed, the operating frequency returns
> +		to normal. When a user reads the tickle entry, it returns
> +		the number of tickle executions for the device. When a user
> +		writes to the tickle entry with the tickle duration in ms,
> +		the effect of device tickling is held for the designated
> +		duration. Note that the duration is rounded-up by
> +		the value DEVFREQ_INTERVAL defined in devfreq.c
> diff --git a/Documentation/ABI/testing/sysfs-power b/Documentation/ABI/testing/sysfs-power
> index b464d12..4d8434b 100644
> --- a/Documentation/ABI/testing/sysfs-power
> +++ b/Documentation/ABI/testing/sysfs-power
> @@ -172,3 +172,46 @@ Description:
>  
>  		Reading from this file will display the current value, which is
>  		set to 1 MB by default.
> +
> +What:		/sys/power/devfreq/
> +Date:		May 2011
> +Contact:	MyungJoo Ham <myungjoo.ham@samsung.com>
> +Description:
> +		The /sys/power/devfreq directory will contain files that will
> +		provide a unified interface to the DEVFREQ, a generic DVFS
> +		(dynamic voltage and frequency scaling) framework.
> +
> +What:		/sys/power/devfreq/tickle_all
> +Date:		May 2011
> +Contact:	MyungJoo Ham <myungjoo.ham@samsung.com>
> +Description:
> +		The /sys/power/devfreq/tickle_all file allows user space to
> +		force every device with DEVFREQ to operate at the maximum
> +		frequency of the device instaneously and temporarily. After
> +		a designated delay has passed, the operating frequency returns
> +		to normal. If a user reads the tickle_all entry, it returns
> +		the number of tickle_all executions. When writing to the
> +		tickle_all entry, the user should supply with the duration of
> +		tickle in ms (the "designated delay" mentioned before). Then,
> +		the effect of tickle_all will hold for the denoted duration.
> +		Note that the duration is rounded by the monitoring period
> +		defined by DEVFREQ_INTERVAL in /drivers/base/power/devfreq.c.
> +
> +What:		/sys/power/devfreq/min_interval
> +Date:		May 2011
> +Contact:	MyungJoo Ham <myungjoo.ham@samsung.com>
> +Description:
> +		The /sys/power/devfreq/min_interval file shows the monitoring
> +		period defined by DEVFREQ_INTERVAL in
> +		/drivers/base/power/devfreq.c. The duration of device tickling
> +		is rounded-up by DEVFREQ_INTERVAL.
> +
> +What:		/sys/power/devfreq/monitoring
> +Date:		May 2011
> +Contact:	MyungJoo Ham <myungjoo.ham@samsung.com>
> +Description:
> +		The /sys/power/devfreq/monitoring file shows whether DEVFREQ
> +		is periodically monitoring. Periodic monitoring is activated
> +		if there is a device that wants periodic monitoring for DVFS or
> +		there is a device that is tickled (and the tickling duration is
> +		not yet expired).
> diff --git a/drivers/base/power/devfreq.c b/drivers/base/power/devfreq.c
> index 7648a94..709c138 100644
> --- a/drivers/base/power/devfreq.c
> +++ b/drivers/base/power/devfreq.c
> @@ -40,6 +40,9 @@ static LIST_HEAD(devfreq_list);
>  /* Exclusive access to devfreq_list and its elements */
>  static DEFINE_MUTEX(devfreq_list_lock);
>  
> +static struct kobject *devfreq_kobj;
> +static struct attribute_group dev_attr_group;
> +
>  /**
>   * find_device_devfreq() - find devfreq struct using device pointer
>   * @dev:	device pointer used to lookup device DEVFREQ.
> @@ -237,6 +240,8 @@ int devfreq_add_device(struct device *dev, struct devfreq_dev_profile *profile,
>  		queue_delayed_work(devfreq_wq, &devfreq_work,
>  				   msecs_to_jiffies(DEVFREQ_INTERVAL));
>  	}
> +
> +	sysfs_update_group(&dev->kobj, &dev_attr_group);

This appears to modify the attributes of the device, but anything like it is
not mentioned in the documentation.  What's up?

>  out:
>  	mutex_unlock(&devfreq_list_lock);
>  
> @@ -263,6 +268,8 @@ int devfreq_remove_device(struct device *dev)
>  		return -EINVAL;
>  	}
>  
> +	sysfs_remove_group(&dev->kobj, &dev_attr_group);
> +
>  	list_del(&devfreq->node);
>  
>  	kfree(devfreq);
> @@ -344,7 +351,7 @@ static int _devfreq_tickle_device(struct devfreq *df, unsigned long delay)
>  	if (devfreq_wq && !polling) {
>  		polling = true;
>  		queue_delayed_work(devfreq_wq, &devfreq_work,
> -				msecs_to_jiffies(DEVFREQ_INTERVAL));
> +				   msecs_to_jiffies(DEVFREQ_INTERVAL));

This change doesn't seem to belong to this patch.

>  	}
>  
>  	return err;
> @@ -382,6 +389,116 @@ int devfreq_tickle_device(struct device *dev, unsigned long duration_ms)
>  	return err;
>  }
>  
> +static int num_tickle_all;
> +static ssize_t tickle_all(struct device *dev, struct device_attribute *attr,
> +			  const char *buf, size_t count)
> +{
> +	int duration = 0;
> +	struct devfreq *tmp;
> +	unsigned long delay;
> +
> +	sscanf(buf, "%d", &duration);
> +	if (duration < DEVFREQ_INTERVAL)
> +		duration = DEVFREQ_INTERVAL;
> +
> +	if (unlikely(IS_ERR_OR_NULL(dev))) {
> +		pr_err("%s: Invalid parameters\n", __func__);

Please say "null device" instead of in addition to the above.

> +		return -EINVAL;
> +	}
> +
> +	delay = DIV_ROUND_UP(duration, DEVFREQ_INTERVAL);
> +
> +	mutex_lock(&devfreq_list_lock);
> +	list_for_each_entry(tmp, &devfreq_list, node) {
> +		_devfreq_tickle_device(tmp, delay);
> +	}
> +	mutex_unlock(&devfreq_list_lock);
> +
> +	num_tickle_all++;
> +	return count;
> +}
> +
> +static ssize_t show_num_tickle_all(struct device *dev,
> +				   struct device_attribute *attr, char *buf)
> +{
> +	return sprintf(buf, "%d\n", num_tickle_all);
> +}
> +
> +static ssize_t show_min_interval(struct device *dev,
> +				 struct device_attribute *attr, char *buf)
> +{
> +	return sprintf(buf, "%d\n", DEVFREQ_INTERVAL);
> +}
> +
> +static ssize_t show_monitoring(struct device *dev,
> +			       struct device_attribute *attr, char *buf)
> +{
> +	return sprintf(buf, "%d\n", monitoring ? 1 : 0);
> +}
> +
> +static DEVICE_ATTR(tickle_all, 0644, show_num_tickle_all, tickle_all);
> +static DEVICE_ATTR(min_interval, 0444, show_min_interval, NULL);
> +static DEVICE_ATTR(monitoring, 0444, show_monitoring, NULL);
> +static struct attribute *devfreq_entries[] = {
> +	&dev_attr_tickle_all.attr,
> +	&dev_attr_min_interval.attr,
> +	&dev_attr_monitoring.attr,
> +	NULL,
> +};
> +static struct attribute_group devfreq_attr_group = {
> +	.name	= NULL,
> +	.attrs	= devfreq_entries,
> +};
> +
> +static ssize_t tickle(struct device *dev, struct device_attribute *attr,
> +		      const char *buf, size_t count)
> +{
> +	int duration;
> +	struct devfreq *df;
> +	unsigned long delay;
> +
> +	sscanf(buf, "%d", &duration);
> +	if (duration < DEVFREQ_INTERVAL)
> +		duration = DEVFREQ_INTERVAL;
> +
> +	if (unlikely(IS_ERR_OR_NULL(dev))) {
> +		pr_err("%s: Invalid parameters\n", __func__);

Like above.

> +		return -EINVAL;
> +	}
> +
> +	delay = DIV_ROUND_UP(duration, DEVFREQ_INTERVAL);
> +
> +	mutex_lock(&devfreq_list_lock);
> +	df = find_device_devfreq(dev);
> +	_devfreq_tickle_device(df, delay);
> +	mutex_unlock(&devfreq_list_lock);
> +
> +	return count;
> +}
> +
> +static ssize_t show_num_tickle(struct device *dev,
> +			       struct device_attribute *attr, char *buf)
> +{
> +	struct devfreq *df;
> +
> +	df = find_device_devfreq(dev);
> +
> +	if (!IS_ERR(df))
> +		return sprintf(buf, "%d\n", df->num_tickle);
> +
> +	return PTR_ERR(df);
> +}
> +
> +static DEVICE_ATTR(tickle, 0644, show_num_tickle, tickle);
> +static struct attribute *dev_entries[] = {
> +	&dev_attr_tickle.attr,
> +	NULL,
> +};
> +static struct attribute_group dev_attr_group = {
> +	.name	= "devfreq",
> +	.attrs	= dev_entries,
> +};
> +
>  static int __init devfreq_init(void)
>  {
>  	mutex_lock(&devfreq_list_lock);
> @@ -389,6 +506,20 @@ static int __init devfreq_init(void)
>  	polling = false;
>  	devfreq_wq = create_freezable_workqueue("devfreq_wq");
>  	INIT_DELAYED_WORK_DEFERRABLE(&devfreq_work, devfreq_monitor);
> +
> +#ifdef CONFIG_PM
> +	/* Create sysfs */
> +	devfreq_kobj = kobject_create_and_add("devfreq", power_kobj);

Hmm, so power_kobj is global?  It shouldn't be.

Generally, whatever adds attributes to /sys/power should be located in
kernel/power/ .

> +	if (!devfreq_kobj) {
> +		pr_err("Unable to create DEVFREQ kobject.\n");
> +		goto out;
> +	}
> +	if (sysfs_create_group(devfreq_kobj, &devfreq_attr_group)) {
> +		pr_err("Unable to create DEVFREQ sysfs entries.\n");
> +		goto out;
> +	}
> +#endif
> +out:
>  	mutex_unlock(&devfreq_list_lock);
>  
>  	devfreq_monitor(&devfreq_work.work);
> diff --git a/include/linux/devfreq.h b/include/linux/devfreq.h
> index 1ec9a40..69334e7 100644
> --- a/include/linux/devfreq.h
> +++ b/include/linux/devfreq.h
> @@ -59,6 +59,7 @@ struct devfreq_governor {
>   *		at each executino of devfreq_monitor, tickle is decremented.
>   *		User may tickle a device-devfreq in order to set maximum
>   *		frequency instaneously with some guaranteed duration.
> + * @num_tickle	number of tickle calls.
>   *
>   * This structure stores the DEVFREQ information for a give device.
>   */
> @@ -72,6 +73,8 @@ struct devfreq {
>  	unsigned long previous_freq;
>  	unsigned int next_polling;
>  	unsigned int tickle;
> +
> +	unsigned int num_tickle;
>  };
>  
>  #if defined(CONFIG_PM_DEVFREQ)

It looks like the above two changes should be moved to a separate patch.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 2/3] PM / DEVFREQ: add example governors
  2011-07-02 21:58   ` Rafael J. Wysocki
@ 2011-07-04  1:37       ` MyungJoo Ham
  0 siblings, 0 replies; 35+ messages in thread
From: MyungJoo Ham @ 2011-07-04  1:37 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-pm, linux-kernel, Len Brown, Pavel Machek,
	Greg Kroah-Hartman, Kyungmin Park, Jiejing Zhang, Colin Cross,
	Nishanth Menon, Thomas Gleixner

Hello,

2011/7/3 Rafael J. Wysocki <rjw@sisk.pl>:
> Hi,
>
> On Friday, May 27, 2011, MyungJoo Ham wrote:
>> Three CPUFREQ-like governors are provided as examples.
>>
>> powersave: use the lowest frequency possible. The user (device) should
>> set the polling_ms as 0 because polling is useless for this governor.
>>
>> performance: use the highest freqeuncy possible. The user (device)
>> should set the polling_ms as 0 because polling is useless for this
>> governor.
>>
>> simple_ondemand: simplified version of CPUFREQ's ONDEMAND governor.
>>
>> When a user updates OPP entries (enable/disable/add), OPP framework
>> automatically notifies DEVFREQ to update operating frequency
>> accordingly. Thus, DEVFREQ users (device drivers) do not need to update
>> DEVFREQ manually with OPP entry updates or set polling_ms for powersave
>> , performance, or any other "static" governors.
>>
>> Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
>> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
[]
>> +
>> +     /* Set the desired frequency based on the load */
>> +     a = (unsigned long long) stat.busy_time * stat.current_frequency;
>
> What's the purpose of the conversion?

Assuming that the work speed of a device is proportional to its
frequency, it measures the amount of work done.
It's time * work/time. For example, during the last 10 second, if the
busy_time was 5 sec and frequency was 10MHz,
it's "50M", which is same as 20MHz and 2.5 sec.

>
>> +     b = div_u64(a, stat.total_time);
>> +     b *= 100;
>> +     b = div_u64(b, (DFSO_UPTHRESHOLD - DFSO_DOWNDIFFERENCTIAL / 2));
>> +     *freq = (unsigned long) b;
>> +
>> +     return 0;
>> +}
>
> Rafael
>



-- 
MyungJoo Ham (함명주), Ph.D.
Mobile Software Platform Lab,
Digital Media and Communications (DMC) Business
Samsung Electronics
cell: 82-10-6714-2858

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 2/3] PM / DEVFREQ: add example governors
@ 2011-07-04  1:37       ` MyungJoo Ham
  0 siblings, 0 replies; 35+ messages in thread
From: MyungJoo Ham @ 2011-07-04  1:37 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Len Brown, Greg Kroah-Hartman, linux-kernel, Kyungmin Park,
	linux-pm, Thomas Gleixner

Hello,

2011/7/3 Rafael J. Wysocki <rjw@sisk.pl>:
> Hi,
>
> On Friday, May 27, 2011, MyungJoo Ham wrote:
>> Three CPUFREQ-like governors are provided as examples.
>>
>> powersave: use the lowest frequency possible. The user (device) should
>> set the polling_ms as 0 because polling is useless for this governor.
>>
>> performance: use the highest freqeuncy possible. The user (device)
>> should set the polling_ms as 0 because polling is useless for this
>> governor.
>>
>> simple_ondemand: simplified version of CPUFREQ's ONDEMAND governor.
>>
>> When a user updates OPP entries (enable/disable/add), OPP framework
>> automatically notifies DEVFREQ to update operating frequency
>> accordingly. Thus, DEVFREQ users (device drivers) do not need to update
>> DEVFREQ manually with OPP entry updates or set polling_ms for powersave
>> , performance, or any other "static" governors.
>>
>> Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
>> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
[]
>> +
>> +     /* Set the desired frequency based on the load */
>> +     a = (unsigned long long) stat.busy_time * stat.current_frequency;
>
> What's the purpose of the conversion?

Assuming that the work speed of a device is proportional to its
frequency, it measures the amount of work done.
It's time * work/time. For example, during the last 10 second, if the
busy_time was 5 sec and frequency was 10MHz,
it's "50M", which is same as 20MHz and 2.5 sec.

>
>> +     b = div_u64(a, stat.total_time);
>> +     b *= 100;
>> +     b = div_u64(b, (DFSO_UPTHRESHOLD - DFSO_DOWNDIFFERENCTIAL / 2));
>> +     *freq = (unsigned long) b;
>> +
>> +     return 0;
>> +}
>
> Rafael
>



-- 
MyungJoo Ham (함명주), Ph.D.
Mobile Software Platform Lab,
Digital Media and Communications (DMC) Business
Samsung Electronics
cell: 82-10-6714-2858
_______________________________________________
linux-pm mailing list
linux-pm@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/linux-pm

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 3/3] PM / DEVFREQ: add sysfs interface (including user tickling)
  2011-07-02 22:13   ` Rafael J. Wysocki
@ 2011-07-04  7:15     ` MyungJoo Ham
  2011-07-04  8:48       ` Rafael J. Wysocki
  2011-07-04  8:48       ` Rafael J. Wysocki
  2011-07-04  7:15     ` MyungJoo Ham
  1 sibling, 2 replies; 35+ messages in thread
From: MyungJoo Ham @ 2011-07-04  7:15 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-pm, linux-kernel, Len Brown, Pavel Machek,
	Greg Kroah-Hartman, Kyungmin Park, Jiejing Zhang, Colin Cross,
	Nishanth Menon, Thomas Gleixner

Hello,

2011/7/3 Rafael J. Wysocki <rjw@sisk.pl>:
> On Friday, May 27, 2011, MyungJoo Ham wrote:
>> 1. System-wide sysfs interface
>> - tickle_all  R: number of tickle_all execution
>>               W: tickle all devfreq devices
>> - min_interval        R: devfreq monitoring base interval in ms
>> - monitoring  R: shows whether devfreq monitoring is active or
>>   not.
>>
>> 2. Device specific sysfs interface
>> - tickle      R: number of tickle execution for the device
>>               W: tickle the device
>>
>> Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
>> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
>>
>> --
>> Changed from v2
>> - add ABI entries for devfreq sysfs interface
>> ---
>>  Documentation/ABI/testing/sysfs-devices-devfreq |   21 ++++
>>  Documentation/ABI/testing/sysfs-power           |   43 ++++++++
>>  drivers/base/power/devfreq.c                    |  133 ++++++++++++++++++++++-
>>  include/linux/devfreq.h                         |    3 +
>>  4 files changed, 199 insertions(+), 1 deletions(-)
>>  create mode 100644 Documentation/ABI/testing/sysfs-devices-devfreq
>>
>> diff --git a/Documentation/ABI/testing/sysfs-devices-devfreq b/Documentation/ABI/testing/sysfs-devices-devfreq
>> new file mode 100644
>> index 0000000..7f35a64
>> --- /dev/null
>> +++ b/Documentation/ABI/testing/sysfs-devices-devfreq
>> @@ -0,0 +1,21 @@
>> +What:                /sys/devices/.../devfreq/
>> +Date:                May 2011
>> +Contact:     MyungJoo Ham <myungjoo.ham@samsung.com>
>> +Description:
>> +             The /sys/device/.../devfreq directory will contain files
>> +             that provide interfaces to DEVFREQ for a specific device.
>> +
>> +What:                /sys/devices/.../devfreq/tickle
>> +Date:                May 2011
>> +Contact:     MyungJoo Ham <myungjoo.ham@samsung.com>
>> +Description:
>> +             The /sys/devices/.../devfreq/tickle file allows user space
>> +             to force the corresponding device to operate at its maximum
>> +             operable frequency instaneously and temporarily. After a
>> +             designated duration has passed, the operating frequency returns
>> +             to normal. When a user reads the tickle entry, it returns
>> +             the number of tickle executions for the device. When a user
>> +             writes to the tickle entry with the tickle duration in ms,
>> +             the effect of device tickling is held for the designated
>> +             duration. Note that the duration is rounded-up by
>> +             the value DEVFREQ_INTERVAL defined in devfreq.c
>> diff --git a/Documentation/ABI/testing/sysfs-power b/Documentation/ABI/testing/sysfs-power
>> index b464d12..4d8434b 100644
>> --- a/Documentation/ABI/testing/sysfs-power
>> +++ b/Documentation/ABI/testing/sysfs-power
>> @@ -172,3 +172,46 @@ Description:
>>
>>               Reading from this file will display the current value, which is
>>               set to 1 MB by default.
>> +
>> +What:                /sys/power/devfreq/
>> +Date:                May 2011
>> +Contact:     MyungJoo Ham <myungjoo.ham@samsung.com>
>> +Description:
>> +             The /sys/power/devfreq directory will contain files that will
>> +             provide a unified interface to the DEVFREQ, a generic DVFS
>> +             (dynamic voltage and frequency scaling) framework.
>> +
>> +What:                /sys/power/devfreq/tickle_all
>> +Date:                May 2011
>> +Contact:     MyungJoo Ham <myungjoo.ham@samsung.com>
>> +Description:
>> +             The /sys/power/devfreq/tickle_all file allows user space to
>> +             force every device with DEVFREQ to operate at the maximum
>> +             frequency of the device instaneously and temporarily. After
>> +             a designated delay has passed, the operating frequency returns
>> +             to normal. If a user reads the tickle_all entry, it returns
>> +             the number of tickle_all executions. When writing to the
>> +             tickle_all entry, the user should supply with the duration of
>> +             tickle in ms (the "designated delay" mentioned before). Then,
>> +             the effect of tickle_all will hold for the denoted duration.
>> +             Note that the duration is rounded by the monitoring period
>> +             defined by DEVFREQ_INTERVAL in /drivers/base/power/devfreq.c.
>> +
>> +What:                /sys/power/devfreq/min_interval
>> +Date:                May 2011
>> +Contact:     MyungJoo Ham <myungjoo.ham@samsung.com>
>> +Description:
>> +             The /sys/power/devfreq/min_interval file shows the monitoring
>> +             period defined by DEVFREQ_INTERVAL in
>> +             /drivers/base/power/devfreq.c. The duration of device tickling
>> +             is rounded-up by DEVFREQ_INTERVAL.
>> +
>> +What:                /sys/power/devfreq/monitoring
>> +Date:                May 2011
>> +Contact:     MyungJoo Ham <myungjoo.ham@samsung.com>
>> +Description:
>> +             The /sys/power/devfreq/monitoring file shows whether DEVFREQ
>> +             is periodically monitoring. Periodic monitoring is activated
>> +             if there is a device that wants periodic monitoring for DVFS or
>> +             there is a device that is tickled (and the tickling duration is
>> +             not yet expired).
>> diff --git a/drivers/base/power/devfreq.c b/drivers/base/power/devfreq.c
>> index 7648a94..709c138 100644
>> --- a/drivers/base/power/devfreq.c
>> +++ b/drivers/base/power/devfreq.c
>> @@ -40,6 +40,9 @@ static LIST_HEAD(devfreq_list);
>>  /* Exclusive access to devfreq_list and its elements */
>>  static DEFINE_MUTEX(devfreq_list_lock);
>>
>> +static struct kobject *devfreq_kobj;
>> +static struct attribute_group dev_attr_group;
>> +
>>  /**
>>   * find_device_devfreq() - find devfreq struct using device pointer
>>   * @dev:     device pointer used to lookup device DEVFREQ.
>> @@ -237,6 +240,8 @@ int devfreq_add_device(struct device *dev, struct devfreq_dev_profile *profile,
>>               queue_delayed_work(devfreq_wq, &devfreq_work,
>>                                  msecs_to_jiffies(DEVFREQ_INTERVAL));
>>       }
>> +
>> +     sysfs_update_group(&dev->kobj, &dev_attr_group);
>
> This appears to modify the attributes of the device, but anything like it is
> not mentioned in the documentation.  What's up?

It is mentioned "Documentation/ABI/testing/sysfs-devices-devfreq".

>
>>  out:
>>       mutex_unlock(&devfreq_list_lock);
>>
>> @@ -263,6 +268,8 @@ int devfreq_remove_device(struct device *dev)
>>               return -EINVAL;
>>       }
>>
>> +     sysfs_remove_group(&dev->kobj, &dev_attr_group);
>> +
>>       list_del(&devfreq->node);
>>
>>       kfree(devfreq);
>> @@ -344,7 +351,7 @@ static int _devfreq_tickle_device(struct devfreq *df, unsigned long delay)
>>       if (devfreq_wq && !polling) {
>>               polling = true;
>>               queue_delayed_work(devfreq_wq, &devfreq_work,
>> -                             msecs_to_jiffies(DEVFREQ_INTERVAL));
>> +                                msecs_to_jiffies(DEVFREQ_INTERVAL));
>
> This change doesn't seem to belong to this patch.

Oops. I'll rearrange the patch series.

>
>>       }
>>
>>       return err;
>> @@ -382,6 +389,116 @@ int devfreq_tickle_device(struct device *dev, unsigned long duration_ms)
>>       return err;
>>  }
>>
>> +static int num_tickle_all;
>> +static ssize_t tickle_all(struct device *dev, struct device_attribute *attr,
>> +                       const char *buf, size_t count)
>> +{
>> +     int duration = 0;
>> +     struct devfreq *tmp;
>> +     unsigned long delay;
>> +
>> +     sscanf(buf, "%d", &duration);
>> +     if (duration < DEVFREQ_INTERVAL)
>> +             duration = DEVFREQ_INTERVAL;
>> +
>> +     if (unlikely(IS_ERR_OR_NULL(dev))) {
>> +             pr_err("%s: Invalid parameters\n", __func__);
>
> Please say "null device" instead of in addition to the above.

Ok. I'll let it say "Null or invalid device" as it's IS_ERR_OR_NULL.

>
>> +             return -EINVAL;
>> +     }
>> +
>> +     delay = DIV_ROUND_UP(duration, DEVFREQ_INTERVAL);
>> +
>> +     mutex_lock(&devfreq_list_lock);
>> +     list_for_each_entry(tmp, &devfreq_list, node) {
>> +             _devfreq_tickle_device(tmp, delay);
>> +     }
>> +     mutex_unlock(&devfreq_list_lock);
>> +
>> +     num_tickle_all++;
>> +     return count;
>> +}
>> +
>> +static ssize_t show_num_tickle_all(struct device *dev,
>> +                                struct device_attribute *attr, char *buf)
>> +{
>> +     return sprintf(buf, "%d\n", num_tickle_all);
>> +}
>> +
>> +static ssize_t show_min_interval(struct device *dev,
>> +                              struct device_attribute *attr, char *buf)
>> +{
>> +     return sprintf(buf, "%d\n", DEVFREQ_INTERVAL);
>> +}
>> +
>> +static ssize_t show_monitoring(struct device *dev,
>> +                            struct device_attribute *attr, char *buf)
>> +{
>> +     return sprintf(buf, "%d\n", monitoring ? 1 : 0);
>> +}
>> +
>> +static DEVICE_ATTR(tickle_all, 0644, show_num_tickle_all, tickle_all);
>> +static DEVICE_ATTR(min_interval, 0444, show_min_interval, NULL);
>> +static DEVICE_ATTR(monitoring, 0444, show_monitoring, NULL);
>> +static struct attribute *devfreq_entries[] = {
>> +     &dev_attr_tickle_all.attr,
>> +     &dev_attr_min_interval.attr,
>> +     &dev_attr_monitoring.attr,
>> +     NULL,
>> +};
>> +static struct attribute_group devfreq_attr_group = {
>> +     .name   = NULL,
>> +     .attrs  = devfreq_entries,
>> +};
>> +
>> +static ssize_t tickle(struct device *dev, struct device_attribute *attr,
>> +                   const char *buf, size_t count)
>> +{
>> +     int duration;
>> +     struct devfreq *df;
>> +     unsigned long delay;
>> +
>> +     sscanf(buf, "%d", &duration);
>> +     if (duration < DEVFREQ_INTERVAL)
>> +             duration = DEVFREQ_INTERVAL;
>> +
>> +     if (unlikely(IS_ERR_OR_NULL(dev))) {
>> +             pr_err("%s: Invalid parameters\n", __func__);
>
> Like above.

Yup.

>
>> +             return -EINVAL;
>> +     }
>> +
>> +     delay = DIV_ROUND_UP(duration, DEVFREQ_INTERVAL);
>> +
>> +     mutex_lock(&devfreq_list_lock);
>> +     df = find_device_devfreq(dev);
>> +     _devfreq_tickle_device(df, delay);
>> +     mutex_unlock(&devfreq_list_lock);
>> +
>> +     return count;
>> +}
>> +
>> +static ssize_t show_num_tickle(struct device *dev,
>> +                            struct device_attribute *attr, char *buf)
>> +{
>> +     struct devfreq *df;
>> +
>> +     df = find_device_devfreq(dev);
>> +
>> +     if (!IS_ERR(df))
>> +             return sprintf(buf, "%d\n", df->num_tickle);
>> +
>> +     return PTR_ERR(df);
>> +}
>> +
>> +static DEVICE_ATTR(tickle, 0644, show_num_tickle, tickle);
>> +static struct attribute *dev_entries[] = {
>> +     &dev_attr_tickle.attr,
>> +     NULL,
>> +};
>> +static struct attribute_group dev_attr_group = {
>> +     .name   = "devfreq",
>> +     .attrs  = dev_entries,
>> +};
>> +
>>  static int __init devfreq_init(void)
>>  {
>>       mutex_lock(&devfreq_list_lock);
>> @@ -389,6 +506,20 @@ static int __init devfreq_init(void)
>>       polling = false;
>>       devfreq_wq = create_freezable_workqueue("devfreq_wq");
>>       INIT_DELAYED_WORK_DEFERRABLE(&devfreq_work, devfreq_monitor);
>> +
>> +#ifdef CONFIG_PM
>> +     /* Create sysfs */
>> +     devfreq_kobj = kobject_create_and_add("devfreq", power_kobj);
>
> Hmm, so power_kobj is global?  It shouldn't be.

Yes, it is global and it's declared at include/linux/kobject.h.

>
> Generally, whatever adds attributes to /sys/power should be located in
> kernel/power/ .

I've put it at /sys/power because these attributes are global to every
device with devfreq.

Anyway, if you think that's a weird location for it, could you please
give me some hints on where would be proper to locate system-wide
devfreq sysfs attributes?
Or, do you think kernel/power/devfreq.c would be a proper location for devfreq?

Or, what about allowing every /sys/devices/.../devfreq/global/* to be
the same "global" attributes of devfreq?

>
>> +     if (!devfreq_kobj) {
>> +             pr_err("Unable to create DEVFREQ kobject.\n");
>> +             goto out;
>> +     }
>> +     if (sysfs_create_group(devfreq_kobj, &devfreq_attr_group)) {
>> +             pr_err("Unable to create DEVFREQ sysfs entries.\n");
>> +             goto out;
>> +     }
>> +#endif
>> +out:
>>       mutex_unlock(&devfreq_list_lock);
>>
>>       devfreq_monitor(&devfreq_work.work);
>> diff --git a/include/linux/devfreq.h b/include/linux/devfreq.h
>> index 1ec9a40..69334e7 100644
>> --- a/include/linux/devfreq.h
>> +++ b/include/linux/devfreq.h
>> @@ -59,6 +59,7 @@ struct devfreq_governor {
>>   *           at each executino of devfreq_monitor, tickle is decremented.
>>   *           User may tickle a device-devfreq in order to set maximum
>>   *           frequency instaneously with some guaranteed duration.
>> + * @num_tickle       number of tickle calls.
>>   *
>>   * This structure stores the DEVFREQ information for a give device.
>>   */
>> @@ -72,6 +73,8 @@ struct devfreq {
>>       unsigned long previous_freq;
>>       unsigned int next_polling;
>>       unsigned int tickle;
>> +
>> +     unsigned int num_tickle;
>>  };
>>
>>  #if defined(CONFIG_PM_DEVFREQ)
>
> It looks like the above two changes should be moved to a separate patch.

These two changes are for the sysfs interface as counting number of
tickle is added with sysfs interface.

>
> Thanks,
> Rafael
>


Thank you,
- MyungJoo
-- 
MyungJoo Ham (함명주), Ph.D.
Mobile Software Platform Lab,
Digital Media and Communications (DMC) Business
Samsung Electronics
cell: 82-10-6714-2858

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 3/3] PM / DEVFREQ: add sysfs interface (including user tickling)
  2011-07-02 22:13   ` Rafael J. Wysocki
  2011-07-04  7:15     ` MyungJoo Ham
@ 2011-07-04  7:15     ` MyungJoo Ham
  1 sibling, 0 replies; 35+ messages in thread
From: MyungJoo Ham @ 2011-07-04  7:15 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Len Brown, Greg Kroah-Hartman, linux-kernel, Kyungmin Park,
	linux-pm, Thomas Gleixner

Hello,

2011/7/3 Rafael J. Wysocki <rjw@sisk.pl>:
> On Friday, May 27, 2011, MyungJoo Ham wrote:
>> 1. System-wide sysfs interface
>> - tickle_all  R: number of tickle_all execution
>>               W: tickle all devfreq devices
>> - min_interval        R: devfreq monitoring base interval in ms
>> - monitoring  R: shows whether devfreq monitoring is active or
>>   not.
>>
>> 2. Device specific sysfs interface
>> - tickle      R: number of tickle execution for the device
>>               W: tickle the device
>>
>> Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
>> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
>>
>> --
>> Changed from v2
>> - add ABI entries for devfreq sysfs interface
>> ---
>>  Documentation/ABI/testing/sysfs-devices-devfreq |   21 ++++
>>  Documentation/ABI/testing/sysfs-power           |   43 ++++++++
>>  drivers/base/power/devfreq.c                    |  133 ++++++++++++++++++++++-
>>  include/linux/devfreq.h                         |    3 +
>>  4 files changed, 199 insertions(+), 1 deletions(-)
>>  create mode 100644 Documentation/ABI/testing/sysfs-devices-devfreq
>>
>> diff --git a/Documentation/ABI/testing/sysfs-devices-devfreq b/Documentation/ABI/testing/sysfs-devices-devfreq
>> new file mode 100644
>> index 0000000..7f35a64
>> --- /dev/null
>> +++ b/Documentation/ABI/testing/sysfs-devices-devfreq
>> @@ -0,0 +1,21 @@
>> +What:                /sys/devices/.../devfreq/
>> +Date:                May 2011
>> +Contact:     MyungJoo Ham <myungjoo.ham@samsung.com>
>> +Description:
>> +             The /sys/device/.../devfreq directory will contain files
>> +             that provide interfaces to DEVFREQ for a specific device.
>> +
>> +What:                /sys/devices/.../devfreq/tickle
>> +Date:                May 2011
>> +Contact:     MyungJoo Ham <myungjoo.ham@samsung.com>
>> +Description:
>> +             The /sys/devices/.../devfreq/tickle file allows user space
>> +             to force the corresponding device to operate at its maximum
>> +             operable frequency instaneously and temporarily. After a
>> +             designated duration has passed, the operating frequency returns
>> +             to normal. When a user reads the tickle entry, it returns
>> +             the number of tickle executions for the device. When a user
>> +             writes to the tickle entry with the tickle duration in ms,
>> +             the effect of device tickling is held for the designated
>> +             duration. Note that the duration is rounded-up by
>> +             the value DEVFREQ_INTERVAL defined in devfreq.c
>> diff --git a/Documentation/ABI/testing/sysfs-power b/Documentation/ABI/testing/sysfs-power
>> index b464d12..4d8434b 100644
>> --- a/Documentation/ABI/testing/sysfs-power
>> +++ b/Documentation/ABI/testing/sysfs-power
>> @@ -172,3 +172,46 @@ Description:
>>
>>               Reading from this file will display the current value, which is
>>               set to 1 MB by default.
>> +
>> +What:                /sys/power/devfreq/
>> +Date:                May 2011
>> +Contact:     MyungJoo Ham <myungjoo.ham@samsung.com>
>> +Description:
>> +             The /sys/power/devfreq directory will contain files that will
>> +             provide a unified interface to the DEVFREQ, a generic DVFS
>> +             (dynamic voltage and frequency scaling) framework.
>> +
>> +What:                /sys/power/devfreq/tickle_all
>> +Date:                May 2011
>> +Contact:     MyungJoo Ham <myungjoo.ham@samsung.com>
>> +Description:
>> +             The /sys/power/devfreq/tickle_all file allows user space to
>> +             force every device with DEVFREQ to operate at the maximum
>> +             frequency of the device instaneously and temporarily. After
>> +             a designated delay has passed, the operating frequency returns
>> +             to normal. If a user reads the tickle_all entry, it returns
>> +             the number of tickle_all executions. When writing to the
>> +             tickle_all entry, the user should supply with the duration of
>> +             tickle in ms (the "designated delay" mentioned before). Then,
>> +             the effect of tickle_all will hold for the denoted duration.
>> +             Note that the duration is rounded by the monitoring period
>> +             defined by DEVFREQ_INTERVAL in /drivers/base/power/devfreq.c.
>> +
>> +What:                /sys/power/devfreq/min_interval
>> +Date:                May 2011
>> +Contact:     MyungJoo Ham <myungjoo.ham@samsung.com>
>> +Description:
>> +             The /sys/power/devfreq/min_interval file shows the monitoring
>> +             period defined by DEVFREQ_INTERVAL in
>> +             /drivers/base/power/devfreq.c. The duration of device tickling
>> +             is rounded-up by DEVFREQ_INTERVAL.
>> +
>> +What:                /sys/power/devfreq/monitoring
>> +Date:                May 2011
>> +Contact:     MyungJoo Ham <myungjoo.ham@samsung.com>
>> +Description:
>> +             The /sys/power/devfreq/monitoring file shows whether DEVFREQ
>> +             is periodically monitoring. Periodic monitoring is activated
>> +             if there is a device that wants periodic monitoring for DVFS or
>> +             there is a device that is tickled (and the tickling duration is
>> +             not yet expired).
>> diff --git a/drivers/base/power/devfreq.c b/drivers/base/power/devfreq.c
>> index 7648a94..709c138 100644
>> --- a/drivers/base/power/devfreq.c
>> +++ b/drivers/base/power/devfreq.c
>> @@ -40,6 +40,9 @@ static LIST_HEAD(devfreq_list);
>>  /* Exclusive access to devfreq_list and its elements */
>>  static DEFINE_MUTEX(devfreq_list_lock);
>>
>> +static struct kobject *devfreq_kobj;
>> +static struct attribute_group dev_attr_group;
>> +
>>  /**
>>   * find_device_devfreq() - find devfreq struct using device pointer
>>   * @dev:     device pointer used to lookup device DEVFREQ.
>> @@ -237,6 +240,8 @@ int devfreq_add_device(struct device *dev, struct devfreq_dev_profile *profile,
>>               queue_delayed_work(devfreq_wq, &devfreq_work,
>>                                  msecs_to_jiffies(DEVFREQ_INTERVAL));
>>       }
>> +
>> +     sysfs_update_group(&dev->kobj, &dev_attr_group);
>
> This appears to modify the attributes of the device, but anything like it is
> not mentioned in the documentation.  What's up?

It is mentioned "Documentation/ABI/testing/sysfs-devices-devfreq".

>
>>  out:
>>       mutex_unlock(&devfreq_list_lock);
>>
>> @@ -263,6 +268,8 @@ int devfreq_remove_device(struct device *dev)
>>               return -EINVAL;
>>       }
>>
>> +     sysfs_remove_group(&dev->kobj, &dev_attr_group);
>> +
>>       list_del(&devfreq->node);
>>
>>       kfree(devfreq);
>> @@ -344,7 +351,7 @@ static int _devfreq_tickle_device(struct devfreq *df, unsigned long delay)
>>       if (devfreq_wq && !polling) {
>>               polling = true;
>>               queue_delayed_work(devfreq_wq, &devfreq_work,
>> -                             msecs_to_jiffies(DEVFREQ_INTERVAL));
>> +                                msecs_to_jiffies(DEVFREQ_INTERVAL));
>
> This change doesn't seem to belong to this patch.

Oops. I'll rearrange the patch series.

>
>>       }
>>
>>       return err;
>> @@ -382,6 +389,116 @@ int devfreq_tickle_device(struct device *dev, unsigned long duration_ms)
>>       return err;
>>  }
>>
>> +static int num_tickle_all;
>> +static ssize_t tickle_all(struct device *dev, struct device_attribute *attr,
>> +                       const char *buf, size_t count)
>> +{
>> +     int duration = 0;
>> +     struct devfreq *tmp;
>> +     unsigned long delay;
>> +
>> +     sscanf(buf, "%d", &duration);
>> +     if (duration < DEVFREQ_INTERVAL)
>> +             duration = DEVFREQ_INTERVAL;
>> +
>> +     if (unlikely(IS_ERR_OR_NULL(dev))) {
>> +             pr_err("%s: Invalid parameters\n", __func__);
>
> Please say "null device" instead of in addition to the above.

Ok. I'll let it say "Null or invalid device" as it's IS_ERR_OR_NULL.

>
>> +             return -EINVAL;
>> +     }
>> +
>> +     delay = DIV_ROUND_UP(duration, DEVFREQ_INTERVAL);
>> +
>> +     mutex_lock(&devfreq_list_lock);
>> +     list_for_each_entry(tmp, &devfreq_list, node) {
>> +             _devfreq_tickle_device(tmp, delay);
>> +     }
>> +     mutex_unlock(&devfreq_list_lock);
>> +
>> +     num_tickle_all++;
>> +     return count;
>> +}
>> +
>> +static ssize_t show_num_tickle_all(struct device *dev,
>> +                                struct device_attribute *attr, char *buf)
>> +{
>> +     return sprintf(buf, "%d\n", num_tickle_all);
>> +}
>> +
>> +static ssize_t show_min_interval(struct device *dev,
>> +                              struct device_attribute *attr, char *buf)
>> +{
>> +     return sprintf(buf, "%d\n", DEVFREQ_INTERVAL);
>> +}
>> +
>> +static ssize_t show_monitoring(struct device *dev,
>> +                            struct device_attribute *attr, char *buf)
>> +{
>> +     return sprintf(buf, "%d\n", monitoring ? 1 : 0);
>> +}
>> +
>> +static DEVICE_ATTR(tickle_all, 0644, show_num_tickle_all, tickle_all);
>> +static DEVICE_ATTR(min_interval, 0444, show_min_interval, NULL);
>> +static DEVICE_ATTR(monitoring, 0444, show_monitoring, NULL);
>> +static struct attribute *devfreq_entries[] = {
>> +     &dev_attr_tickle_all.attr,
>> +     &dev_attr_min_interval.attr,
>> +     &dev_attr_monitoring.attr,
>> +     NULL,
>> +};
>> +static struct attribute_group devfreq_attr_group = {
>> +     .name   = NULL,
>> +     .attrs  = devfreq_entries,
>> +};
>> +
>> +static ssize_t tickle(struct device *dev, struct device_attribute *attr,
>> +                   const char *buf, size_t count)
>> +{
>> +     int duration;
>> +     struct devfreq *df;
>> +     unsigned long delay;
>> +
>> +     sscanf(buf, "%d", &duration);
>> +     if (duration < DEVFREQ_INTERVAL)
>> +             duration = DEVFREQ_INTERVAL;
>> +
>> +     if (unlikely(IS_ERR_OR_NULL(dev))) {
>> +             pr_err("%s: Invalid parameters\n", __func__);
>
> Like above.

Yup.

>
>> +             return -EINVAL;
>> +     }
>> +
>> +     delay = DIV_ROUND_UP(duration, DEVFREQ_INTERVAL);
>> +
>> +     mutex_lock(&devfreq_list_lock);
>> +     df = find_device_devfreq(dev);
>> +     _devfreq_tickle_device(df, delay);
>> +     mutex_unlock(&devfreq_list_lock);
>> +
>> +     return count;
>> +}
>> +
>> +static ssize_t show_num_tickle(struct device *dev,
>> +                            struct device_attribute *attr, char *buf)
>> +{
>> +     struct devfreq *df;
>> +
>> +     df = find_device_devfreq(dev);
>> +
>> +     if (!IS_ERR(df))
>> +             return sprintf(buf, "%d\n", df->num_tickle);
>> +
>> +     return PTR_ERR(df);
>> +}
>> +
>> +static DEVICE_ATTR(tickle, 0644, show_num_tickle, tickle);
>> +static struct attribute *dev_entries[] = {
>> +     &dev_attr_tickle.attr,
>> +     NULL,
>> +};
>> +static struct attribute_group dev_attr_group = {
>> +     .name   = "devfreq",
>> +     .attrs  = dev_entries,
>> +};
>> +
>>  static int __init devfreq_init(void)
>>  {
>>       mutex_lock(&devfreq_list_lock);
>> @@ -389,6 +506,20 @@ static int __init devfreq_init(void)
>>       polling = false;
>>       devfreq_wq = create_freezable_workqueue("devfreq_wq");
>>       INIT_DELAYED_WORK_DEFERRABLE(&devfreq_work, devfreq_monitor);
>> +
>> +#ifdef CONFIG_PM
>> +     /* Create sysfs */
>> +     devfreq_kobj = kobject_create_and_add("devfreq", power_kobj);
>
> Hmm, so power_kobj is global?  It shouldn't be.

Yes, it is global and it's declared at include/linux/kobject.h.

>
> Generally, whatever adds attributes to /sys/power should be located in
> kernel/power/ .

I've put it at /sys/power because these attributes are global to every
device with devfreq.

Anyway, if you think that's a weird location for it, could you please
give me some hints on where would be proper to locate system-wide
devfreq sysfs attributes?
Or, do you think kernel/power/devfreq.c would be a proper location for devfreq?

Or, what about allowing every /sys/devices/.../devfreq/global/* to be
the same "global" attributes of devfreq?

>
>> +     if (!devfreq_kobj) {
>> +             pr_err("Unable to create DEVFREQ kobject.\n");
>> +             goto out;
>> +     }
>> +     if (sysfs_create_group(devfreq_kobj, &devfreq_attr_group)) {
>> +             pr_err("Unable to create DEVFREQ sysfs entries.\n");
>> +             goto out;
>> +     }
>> +#endif
>> +out:
>>       mutex_unlock(&devfreq_list_lock);
>>
>>       devfreq_monitor(&devfreq_work.work);
>> diff --git a/include/linux/devfreq.h b/include/linux/devfreq.h
>> index 1ec9a40..69334e7 100644
>> --- a/include/linux/devfreq.h
>> +++ b/include/linux/devfreq.h
>> @@ -59,6 +59,7 @@ struct devfreq_governor {
>>   *           at each executino of devfreq_monitor, tickle is decremented.
>>   *           User may tickle a device-devfreq in order to set maximum
>>   *           frequency instaneously with some guaranteed duration.
>> + * @num_tickle       number of tickle calls.
>>   *
>>   * This structure stores the DEVFREQ information for a give device.
>>   */
>> @@ -72,6 +73,8 @@ struct devfreq {
>>       unsigned long previous_freq;
>>       unsigned int next_polling;
>>       unsigned int tickle;
>> +
>> +     unsigned int num_tickle;
>>  };
>>
>>  #if defined(CONFIG_PM_DEVFREQ)
>
> It looks like the above two changes should be moved to a separate patch.

These two changes are for the sysfs interface as counting number of
tickle is added with sysfs interface.

>
> Thanks,
> Rafael
>


Thank you,
- MyungJoo
-- 
MyungJoo Ham (함명주), Ph.D.
Mobile Software Platform Lab,
Digital Media and Communications (DMC) Business
Samsung Electronics
cell: 82-10-6714-2858
_______________________________________________
linux-pm mailing list
linux-pm@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/linux-pm

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 1/3] PM: Introduce DEVFREQ: generic DVFS framework with device-specific OPPs
  2011-07-02 21:56 ` Rafael J. Wysocki
@ 2011-07-04  8:34   ` MyungJoo Ham
  2011-07-04  8:57     ` Rafael J. Wysocki
  2011-07-04  8:57     ` Rafael J. Wysocki
  2011-07-04  8:34   ` MyungJoo Ham
  1 sibling, 2 replies; 35+ messages in thread
From: MyungJoo Ham @ 2011-07-04  8:34 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-pm, linux-kernel, Len Brown, Pavel Machek,
	Greg Kroah-Hartman, Kyungmin Park, Jiejing Zhang, Colin Cross,
	Nishanth Menon, Thomas Gleixner, Chanwoo Choi

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=UTF-8, Size: 9649 bytes --]

2011/7/3 Rafael J. Wysocki <rjw@sisk.pl>:
> Hi,
>
> I'm not sure if spelling DEVFREQ in the kerneldoc comments with capitals is
> a good idea, it looks kind of odd.  I'm not sure what would be look better,
> though.

Umm.. what about "devfreq" using all non-capitals? It looks fine to me as well.

>
> On Friday, May 27, 2011, MyungJoo Ham wrote:
>
> ...
>> +/**
>> + * devfreq_monitor() - Regularly run devfreq_do() and support device DEVFREQ tickle.
>
> I'd say "periodically" istead of "regularly".

Yes, that is the proper term. Thanks.

>
>> + * @work: the work struct used to run devfreq_monitor periodically.
>
> Also please say something more about the "tickle" thinkg here.

Sure.

>
>> + */
>> +static void devfreq_monitor(struct work_struct *work)
>> +{
>> +     struct devfreq *devfreq;
>> +     int error;
>> +     bool continue_polling = false;
>> +     struct devfreq *to_remove = NULL;
>> +
>> +     mutex_lock(&devfreq_list_lock);
>> +
>> +     list_for_each_entry(devfreq, &devfreq_list, node) {
>> +             /* Remove the devfreq entry that failed */
>> +             if (to_remove) {
>> +                     list_del(&to_remove->node);
>> +                     kfree(to_remove);
>> +                     to_remove = NULL;
>> +             }
>> +
>> +             /*
>> +              * If the device is tickled and the tickle duration is left,
>> +              * do not change the frequency for a while
>> +              */
>> +             if (devfreq->tickle) {
>> +                     continue_polling = true;
>> +                     devfreq->tickle--;
>> +
>> +                     /*
>> +                      * If the tickle is ending and the device is not going
>> +                      * to poll, force the device to poll next time so that
>> +                      * it can return to the original frequency afterwards.
>> +                      * However, non-polling device will have 0 polling_ms,
>> +                      * it will not poll again later.
>> +                      */
>> +                     if (devfreq->tickle == 0 && devfreq->next_polling == 0)
>> +                             devfreq->next_polling = 1;
>> +
>> +                     continue;
>> +             }
>> +
>> +             /* This device does not require polling */
>> +             if (devfreq->next_polling == 0)
>> +                     continue;
>> +
>> +             continue_polling = true;
>> +
>> +             if (devfreq->next_polling == 1) {
>> +                     /* This device is polling this time */
>
> I'd remove this comment, it's confusing IMO.  Besides, it may be better
> to structure the code like this:
>
> if (devfreq->next_polling-- == 1) {
> }
>
> and then you wouldn't need the "else" at all.

Ok, I'll try that style.

>
>> +                     error = devfreq_do(devfreq);
>> +                     if (error && error != -EAGAIN) {
>> +                             /*
>> +                              * Remove a devfreq with error. However,
>> +                              * We cannot remove it right here because the
>
> Comma after "here", please.

"We" should be "we" here, but no comma there though:
http://owl.english.purdue.edu/owl/resource/607/02/

However, "However, because the ..... above, we cannot remove it right
here" is correct.

>
>> +                              * devfreq pointer is going to be used by
>> +                              * list_for_each_entry above. Thus, it is
>> +                              * removed afterwards.
>
> Why don't you use list_for_each_entry_safe(), then?

Ah.. that's a good idea. Thanks! Now, I can reorganize this one.

>
>> +                              */
>> +                             to_remove = devfreq->dev;
>> +                             dev_err(devfreq->dev, "devfreq_do error(%d). "
>> +                                     "DEVFREQ is removed from the device\n",
>> +                                     error);
>> +                             continue;
>> +                     }
>> +                     devfreq->next_polling = DIV_ROUND_UP(
>> +                                             devfreq->profile->polling_ms,
>> +                                             DEVFREQ_INTERVAL);
>> +             } else {
>> +                     /* The device will poll later when next_polling = 1 */
>> +                     devfreq->next_polling--;
>> +             }
>> +     }
>> +
>> +     if (to_remove) {
>> +             list_del(&to_remove->node);
>> +             kfree(to_remove);
>> +             to_remove = NULL;
>> +     }
>> +
>> +     if (continue_polling) {
>> +             polling = true;
>> +             queue_delayed_work(devfreq_wq, &devfreq_work,
>> +                                msecs_to_jiffies(DEVFREQ_INTERVAL));
>> +     } else {
>> +             polling = false;
>> +     }
>
> OK, so why exactly continue_polling is needed?  It seems you might siply use
> "polling" instead of it above.

The purpose of continue_polling seems to disappear in the middle of
development. I'll remove it.

>
>> +
>> +     mutex_unlock(&devfreq_list_lock);
>> +}
> ...
>> +/**
>> + * devfreq_update() - Notify that the device OPP has been changed.
>> + * @dev:     the device whose OPP has been changed.
>> + * @may_not_exist:   do not print error message even if the device
>> + *                   does not have devfreq entry.
>
> This argument isn't used any more.

Ah.. yes. sure.

>
>> + */
>> +int devfreq_update(struct device *dev)
>> +{
>> +     struct devfreq *devfreq;
>> +     int err = 0;
>> +
>> +     mutex_lock(&devfreq_list_lock);
>> +
>> +     devfreq = find_device_devfreq(dev);
>> +     if (IS_ERR(devfreq)) {
>> +             err = PTR_ERR(devfreq);
>> +             goto out;
>> +     }
>> +
>> +     if (devfreq->tickle) {
>> +             /* If the max freq available is changed, re-tickle */
>
> It would be good to explain what "re-tickle" means.

Of course. I'll explain that.

>
>> +             unsigned long freq = devfreq->profile->max_freq;
>> +             struct opp *opp = opp_find_freq_floor(devfreq->dev, &freq);
>> +
>> +             if (IS_ERR(opp)) {
>> +                     err = PTR_ERR(opp);
>> +                     goto out;
>> +             }
>> +
>> +             /* Max freq available is not changed */
>> +             if (devfreq->previous_freq == freq)
>> +                     goto out;
>> +
>> +             err = devfreq->profile->target(devfreq->dev, opp);
>> +             if (!err)
>> +                     devfreq->previous_freq = freq;
>
> Why don't we run devfreq_do() in this case?

When the list of available frequencies is updated and the device is to
be kept at its maximum frequency, we do not need to reevaluate the
device's activities to setup the new frequency. In such cases, we only
need to keep it at its maximum freuqency. If the maximum frequency is
not changed, we don't need anything to do (when "tickling" is
completed, devfreq_do() will reevaluate automatically); otherwise,
tickling again at the new frequency is needed (re-tickle).

>
>> +     } else {
>> +             /* Reevaluate the proper frequency */
>> +             err = devfreq_do(devfreq);
>> +     }
>> +
>> +out:
>> +     mutex_unlock(&devfreq_list_lock);
>> +     return err;
>> +}
>> +
>
> Add a kerneldoc comment here, please.

Yes. I'll add one. (and for any other functions missing kerneldoc comments)

>
>> +static int _devfreq_tickle_device(struct devfreq *df, unsigned long delay)
>> +{
>> +     int err = 0;
>> +     unsigned long freq;
>> +     struct opp *opp;
>> +
>> +     freq = df->profile->max_freq;
>> +     opp = opp_find_freq_floor(df->dev, &freq);
>> +     if (IS_ERR(opp))
>> +             return PTR_ERR(opp);
>> +
>> +     if (df->previous_freq != freq) {
>> +             err = df->profile->target(df->dev, opp);
>> +             if (!err)
>> +                     df->previous_freq = freq;
>> +     }
>> +     if (err) {
>> +             dev_err(df->dev, "%s: Cannot set frequency.\n", __func__);
>> +     } else {
>> +             df->tickle = delay;
>> +             df->num_tickle++;
>> +     }
>> +
>> +     if (devfreq_wq && !polling) {
>> +             polling = true;
>> +             queue_delayed_work(devfreq_wq, &devfreq_work,
>> +                             msecs_to_jiffies(DEVFREQ_INTERVAL));
>> +     }
>> +
>> +     return err;
>> +}
>
> Thanks,
> Rafael
>

Thank you so much for the comments!

Cheers!

- MyungJoo

-- 
MyungJoo Ham (함명주), Ph.D.
Mobile Software Platform Lab,
Digital Media and Communications (DMC) Business
Samsung Electronics
cell: 82-10-6714-2858
ÿôèº{.nÇ+‰·Ÿ®‰­†+%ŠËÿ±éݶ\x17¥Šwÿº{.nÇ+‰·¥Š{±þG«éÿŠ{ayº\x1dʇڙë,j\a­¢f£¢·hšïêÿ‘êçz_è®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?™¨è­Ú&£ø§~á¶iO•æ¬z·švØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?–I¥

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 1/3] PM: Introduce DEVFREQ: generic DVFS framework with device-specific OPPs
  2011-07-02 21:56 ` Rafael J. Wysocki
  2011-07-04  8:34   ` MyungJoo Ham
@ 2011-07-04  8:34   ` MyungJoo Ham
  1 sibling, 0 replies; 35+ messages in thread
From: MyungJoo Ham @ 2011-07-04  8:34 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Len Brown, Greg Kroah-Hartman, linux-kernel, Kyungmin Park,
	linux-pm, Thomas Gleixner

2011/7/3 Rafael J. Wysocki <rjw@sisk.pl>:
> Hi,
>
> I'm not sure if spelling DEVFREQ in the kerneldoc comments with capitals is
> a good idea, it looks kind of odd.  I'm not sure what would be look better,
> though.

Umm.. what about "devfreq" using all non-capitals? It looks fine to me as well.

>
> On Friday, May 27, 2011, MyungJoo Ham wrote:
>
> ...
>> +/**
>> + * devfreq_monitor() - Regularly run devfreq_do() and support device DEVFREQ tickle.
>
> I'd say "periodically" istead of "regularly".

Yes, that is the proper term. Thanks.

>
>> + * @work: the work struct used to run devfreq_monitor periodically.
>
> Also please say something more about the "tickle" thinkg here.

Sure.

>
>> + */
>> +static void devfreq_monitor(struct work_struct *work)
>> +{
>> +     struct devfreq *devfreq;
>> +     int error;
>> +     bool continue_polling = false;
>> +     struct devfreq *to_remove = NULL;
>> +
>> +     mutex_lock(&devfreq_list_lock);
>> +
>> +     list_for_each_entry(devfreq, &devfreq_list, node) {
>> +             /* Remove the devfreq entry that failed */
>> +             if (to_remove) {
>> +                     list_del(&to_remove->node);
>> +                     kfree(to_remove);
>> +                     to_remove = NULL;
>> +             }
>> +
>> +             /*
>> +              * If the device is tickled and the tickle duration is left,
>> +              * do not change the frequency for a while
>> +              */
>> +             if (devfreq->tickle) {
>> +                     continue_polling = true;
>> +                     devfreq->tickle--;
>> +
>> +                     /*
>> +                      * If the tickle is ending and the device is not going
>> +                      * to poll, force the device to poll next time so that
>> +                      * it can return to the original frequency afterwards.
>> +                      * However, non-polling device will have 0 polling_ms,
>> +                      * it will not poll again later.
>> +                      */
>> +                     if (devfreq->tickle == 0 && devfreq->next_polling == 0)
>> +                             devfreq->next_polling = 1;
>> +
>> +                     continue;
>> +             }
>> +
>> +             /* This device does not require polling */
>> +             if (devfreq->next_polling == 0)
>> +                     continue;
>> +
>> +             continue_polling = true;
>> +
>> +             if (devfreq->next_polling == 1) {
>> +                     /* This device is polling this time */
>
> I'd remove this comment, it's confusing IMO.  Besides, it may be better
> to structure the code like this:
>
> if (devfreq->next_polling-- == 1) {
> }
>
> and then you wouldn't need the "else" at all.

Ok, I'll try that style.

>
>> +                     error = devfreq_do(devfreq);
>> +                     if (error && error != -EAGAIN) {
>> +                             /*
>> +                              * Remove a devfreq with error. However,
>> +                              * We cannot remove it right here because the
>
> Comma after "here", please.

"We" should be "we" here, but no comma there though:
http://owl.english.purdue.edu/owl/resource/607/02/

However, "However, because the ..... above, we cannot remove it right
here" is correct.

>
>> +                              * devfreq pointer is going to be used by
>> +                              * list_for_each_entry above. Thus, it is
>> +                              * removed afterwards.
>
> Why don't you use list_for_each_entry_safe(), then?

Ah.. that's a good idea. Thanks! Now, I can reorganize this one.

>
>> +                              */
>> +                             to_remove = devfreq->dev;
>> +                             dev_err(devfreq->dev, "devfreq_do error(%d). "
>> +                                     "DEVFREQ is removed from the device\n",
>> +                                     error);
>> +                             continue;
>> +                     }
>> +                     devfreq->next_polling = DIV_ROUND_UP(
>> +                                             devfreq->profile->polling_ms,
>> +                                             DEVFREQ_INTERVAL);
>> +             } else {
>> +                     /* The device will poll later when next_polling = 1 */
>> +                     devfreq->next_polling--;
>> +             }
>> +     }
>> +
>> +     if (to_remove) {
>> +             list_del(&to_remove->node);
>> +             kfree(to_remove);
>> +             to_remove = NULL;
>> +     }
>> +
>> +     if (continue_polling) {
>> +             polling = true;
>> +             queue_delayed_work(devfreq_wq, &devfreq_work,
>> +                                msecs_to_jiffies(DEVFREQ_INTERVAL));
>> +     } else {
>> +             polling = false;
>> +     }
>
> OK, so why exactly continue_polling is needed?  It seems you might siply use
> "polling" instead of it above.

The purpose of continue_polling seems to disappear in the middle of
development. I'll remove it.

>
>> +
>> +     mutex_unlock(&devfreq_list_lock);
>> +}
> ...
>> +/**
>> + * devfreq_update() - Notify that the device OPP has been changed.
>> + * @dev:     the device whose OPP has been changed.
>> + * @may_not_exist:   do not print error message even if the device
>> + *                   does not have devfreq entry.
>
> This argument isn't used any more.

Ah.. yes. sure.

>
>> + */
>> +int devfreq_update(struct device *dev)
>> +{
>> +     struct devfreq *devfreq;
>> +     int err = 0;
>> +
>> +     mutex_lock(&devfreq_list_lock);
>> +
>> +     devfreq = find_device_devfreq(dev);
>> +     if (IS_ERR(devfreq)) {
>> +             err = PTR_ERR(devfreq);
>> +             goto out;
>> +     }
>> +
>> +     if (devfreq->tickle) {
>> +             /* If the max freq available is changed, re-tickle */
>
> It would be good to explain what "re-tickle" means.

Of course. I'll explain that.

>
>> +             unsigned long freq = devfreq->profile->max_freq;
>> +             struct opp *opp = opp_find_freq_floor(devfreq->dev, &freq);
>> +
>> +             if (IS_ERR(opp)) {
>> +                     err = PTR_ERR(opp);
>> +                     goto out;
>> +             }
>> +
>> +             /* Max freq available is not changed */
>> +             if (devfreq->previous_freq == freq)
>> +                     goto out;
>> +
>> +             err = devfreq->profile->target(devfreq->dev, opp);
>> +             if (!err)
>> +                     devfreq->previous_freq = freq;
>
> Why don't we run devfreq_do() in this case?

When the list of available frequencies is updated and the device is to
be kept at its maximum frequency, we do not need to reevaluate the
device's activities to setup the new frequency. In such cases, we only
need to keep it at its maximum freuqency. If the maximum frequency is
not changed, we don't need anything to do (when "tickling" is
completed, devfreq_do() will reevaluate automatically); otherwise,
tickling again at the new frequency is needed (re-tickle).

>
>> +     } else {
>> +             /* Reevaluate the proper frequency */
>> +             err = devfreq_do(devfreq);
>> +     }
>> +
>> +out:
>> +     mutex_unlock(&devfreq_list_lock);
>> +     return err;
>> +}
>> +
>
> Add a kerneldoc comment here, please.

Yes. I'll add one. (and for any other functions missing kerneldoc comments)

>
>> +static int _devfreq_tickle_device(struct devfreq *df, unsigned long delay)
>> +{
>> +     int err = 0;
>> +     unsigned long freq;
>> +     struct opp *opp;
>> +
>> +     freq = df->profile->max_freq;
>> +     opp = opp_find_freq_floor(df->dev, &freq);
>> +     if (IS_ERR(opp))
>> +             return PTR_ERR(opp);
>> +
>> +     if (df->previous_freq != freq) {
>> +             err = df->profile->target(df->dev, opp);
>> +             if (!err)
>> +                     df->previous_freq = freq;
>> +     }
>> +     if (err) {
>> +             dev_err(df->dev, "%s: Cannot set frequency.\n", __func__);
>> +     } else {
>> +             df->tickle = delay;
>> +             df->num_tickle++;
>> +     }
>> +
>> +     if (devfreq_wq && !polling) {
>> +             polling = true;
>> +             queue_delayed_work(devfreq_wq, &devfreq_work,
>> +                             msecs_to_jiffies(DEVFREQ_INTERVAL));
>> +     }
>> +
>> +     return err;
>> +}
>
> Thanks,
> Rafael
>

Thank you so much for the comments!

Cheers!

- MyungJoo

-- 
MyungJoo Ham (함명주), Ph.D.
Mobile Software Platform Lab,
Digital Media and Communications (DMC) Business
Samsung Electronics
cell: 82-10-6714-2858
_______________________________________________
linux-pm mailing list
linux-pm@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/linux-pm

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 2/3] PM / DEVFREQ: add example governors
  2011-07-04  1:37       ` MyungJoo Ham
  (?)
  (?)
@ 2011-07-04  8:43       ` Rafael J. Wysocki
  2011-07-04  8:58         ` MyungJoo Ham
  2011-07-04  8:58         ` MyungJoo Ham
  -1 siblings, 2 replies; 35+ messages in thread
From: Rafael J. Wysocki @ 2011-07-04  8:43 UTC (permalink / raw)
  To: myungjoo.ham
  Cc: linux-pm, linux-kernel, Len Brown, Pavel Machek,
	Greg Kroah-Hartman, Kyungmin Park, Jiejing Zhang, Colin Cross,
	Nishanth Menon, Thomas Gleixner

On Monday, July 04, 2011, MyungJoo Ham wrote:
> Hello,
> 
> 2011/7/3 Rafael J. Wysocki <rjw@sisk.pl>:
> > Hi,
> >
> > On Friday, May 27, 2011, MyungJoo Ham wrote:
> >> Three CPUFREQ-like governors are provided as examples.
> >>
> >> powersave: use the lowest frequency possible. The user (device) should
> >> set the polling_ms as 0 because polling is useless for this governor.
> >>
> >> performance: use the highest freqeuncy possible. The user (device)
> >> should set the polling_ms as 0 because polling is useless for this
> >> governor.
> >>
> >> simple_ondemand: simplified version of CPUFREQ's ONDEMAND governor.
> >>
> >> When a user updates OPP entries (enable/disable/add), OPP framework
> >> automatically notifies DEVFREQ to update operating frequency
> >> accordingly. Thus, DEVFREQ users (device drivers) do not need to update
> >> DEVFREQ manually with OPP entry updates or set polling_ms for powersave
> >> , performance, or any other "static" governors.
> >>
> >> Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
> >> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
> []
> >> +
> >> +     /* Set the desired frequency based on the load */
> >> +     a = (unsigned long long) stat.busy_time * stat.current_frequency;
> >
> > What's the purpose of the conversion?
> 
> Assuming that the work speed of a device is proportional to its
> frequency, it measures the amount of work done.
> It's time * work/time. For example, during the last 10 second, if the
> busy_time was 5 sec and frequency was 10MHz,
> it's "50M", which is same as 20MHz and 2.5 sec.

I understand that, but my question was why you're doing a forced conversion
to (unsigned long long).

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 2/3] PM / DEVFREQ: add example governors
  2011-07-04  1:37       ` MyungJoo Ham
  (?)
@ 2011-07-04  8:43       ` Rafael J. Wysocki
  -1 siblings, 0 replies; 35+ messages in thread
From: Rafael J. Wysocki @ 2011-07-04  8:43 UTC (permalink / raw)
  To: myungjoo.ham
  Cc: Len Brown, Greg Kroah-Hartman, linux-kernel, Kyungmin Park,
	linux-pm, Thomas Gleixner

On Monday, July 04, 2011, MyungJoo Ham wrote:
> Hello,
> 
> 2011/7/3 Rafael J. Wysocki <rjw@sisk.pl>:
> > Hi,
> >
> > On Friday, May 27, 2011, MyungJoo Ham wrote:
> >> Three CPUFREQ-like governors are provided as examples.
> >>
> >> powersave: use the lowest frequency possible. The user (device) should
> >> set the polling_ms as 0 because polling is useless for this governor.
> >>
> >> performance: use the highest freqeuncy possible. The user (device)
> >> should set the polling_ms as 0 because polling is useless for this
> >> governor.
> >>
> >> simple_ondemand: simplified version of CPUFREQ's ONDEMAND governor.
> >>
> >> When a user updates OPP entries (enable/disable/add), OPP framework
> >> automatically notifies DEVFREQ to update operating frequency
> >> accordingly. Thus, DEVFREQ users (device drivers) do not need to update
> >> DEVFREQ manually with OPP entry updates or set polling_ms for powersave
> >> , performance, or any other "static" governors.
> >>
> >> Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
> >> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
> []
> >> +
> >> +     /* Set the desired frequency based on the load */
> >> +     a = (unsigned long long) stat.busy_time * stat.current_frequency;
> >
> > What's the purpose of the conversion?
> 
> Assuming that the work speed of a device is proportional to its
> frequency, it measures the amount of work done.
> It's time * work/time. For example, during the last 10 second, if the
> busy_time was 5 sec and frequency was 10MHz,
> it's "50M", which is same as 20MHz and 2.5 sec.

I understand that, but my question was why you're doing a forced conversion
to (unsigned long long).

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 3/3] PM / DEVFREQ: add sysfs interface (including user tickling)
  2011-07-04  7:15     ` MyungJoo Ham
  2011-07-04  8:48       ` Rafael J. Wysocki
@ 2011-07-04  8:48       ` Rafael J. Wysocki
  1 sibling, 0 replies; 35+ messages in thread
From: Rafael J. Wysocki @ 2011-07-04  8:48 UTC (permalink / raw)
  To: myungjoo.ham
  Cc: linux-pm, linux-kernel, Len Brown, Pavel Machek,
	Greg Kroah-Hartman, Kyungmin Park, Jiejing Zhang, Colin Cross,
	Nishanth Menon, Thomas Gleixner

On Monday, July 04, 2011, MyungJoo Ham wrote:
> Hello,
> 
> 2011/7/3 Rafael J. Wysocki <rjw@sisk.pl>:
> > On Friday, May 27, 2011, MyungJoo Ham wrote:
> >> 1. System-wide sysfs interface
> >> - tickle_all  R: number of tickle_all execution
> >>               W: tickle all devfreq devices
> >> - min_interval        R: devfreq monitoring base interval in ms
> >> - monitoring  R: shows whether devfreq monitoring is active or
> >>   not.
> >>
> >> 2. Device specific sysfs interface
> >> - tickle      R: number of tickle execution for the device
> >>               W: tickle the device
> >>
> >> Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
> >> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
> >>
> >> --
> >> Changed from v2
> >> - add ABI entries for devfreq sysfs interface
> >> ---
> >>  Documentation/ABI/testing/sysfs-devices-devfreq |   21 ++++
> >>  Documentation/ABI/testing/sysfs-power           |   43 ++++++++
> >>  drivers/base/power/devfreq.c                    |  133 ++++++++++++++++++++++-
> >>  include/linux/devfreq.h                         |    3 +
> >>  4 files changed, 199 insertions(+), 1 deletions(-)
> >>  create mode 100644 Documentation/ABI/testing/sysfs-devices-devfreq
> >>
> >> diff --git a/Documentation/ABI/testing/sysfs-devices-devfreq b/Documentation/ABI/testing/sysfs-devices-devfreq
> >> new file mode 100644
> >> index 0000000..7f35a64
> >> --- /dev/null
> >> +++ b/Documentation/ABI/testing/sysfs-devices-devfreq
> >> @@ -0,0 +1,21 @@
> >> +What:                /sys/devices/.../devfreq/
> >> +Date:                May 2011
> >> +Contact:     MyungJoo Ham <myungjoo.ham@samsung.com>
> >> +Description:
> >> +             The /sys/device/.../devfreq directory will contain files
> >> +             that provide interfaces to DEVFREQ for a specific device.
> >> +
> >> +What:                /sys/devices/.../devfreq/tickle
> >> +Date:                May 2011
> >> +Contact:     MyungJoo Ham <myungjoo.ham@samsung.com>
> >> +Description:
> >> +             The /sys/devices/.../devfreq/tickle file allows user space
> >> +             to force the corresponding device to operate at its maximum
> >> +             operable frequency instaneously and temporarily. After a
> >> +             designated duration has passed, the operating frequency returns
> >> +             to normal. When a user reads the tickle entry, it returns
> >> +             the number of tickle executions for the device. When a user
> >> +             writes to the tickle entry with the tickle duration in ms,
> >> +             the effect of device tickling is held for the designated
> >> +             duration. Note that the duration is rounded-up by
> >> +             the value DEVFREQ_INTERVAL defined in devfreq.c
> >> diff --git a/Documentation/ABI/testing/sysfs-power b/Documentation/ABI/testing/sysfs-power
> >> index b464d12..4d8434b 100644
> >> --- a/Documentation/ABI/testing/sysfs-power
> >> +++ b/Documentation/ABI/testing/sysfs-power
> >> @@ -172,3 +172,46 @@ Description:
> >>
> >>               Reading from this file will display the current value, which is
> >>               set to 1 MB by default.
> >> +
> >> +What:                /sys/power/devfreq/
> >> +Date:                May 2011
> >> +Contact:     MyungJoo Ham <myungjoo.ham@samsung.com>
> >> +Description:
> >> +             The /sys/power/devfreq directory will contain files that will
> >> +             provide a unified interface to the DEVFREQ, a generic DVFS
> >> +             (dynamic voltage and frequency scaling) framework.
> >> +
> >> +What:                /sys/power/devfreq/tickle_all
> >> +Date:                May 2011
> >> +Contact:     MyungJoo Ham <myungjoo.ham@samsung.com>
> >> +Description:
> >> +             The /sys/power/devfreq/tickle_all file allows user space to
> >> +             force every device with DEVFREQ to operate at the maximum
> >> +             frequency of the device instaneously and temporarily. After
> >> +             a designated delay has passed, the operating frequency returns
> >> +             to normal. If a user reads the tickle_all entry, it returns
> >> +             the number of tickle_all executions. When writing to the
> >> +             tickle_all entry, the user should supply with the duration of
> >> +             tickle in ms (the "designated delay" mentioned before). Then,
> >> +             the effect of tickle_all will hold for the denoted duration.
> >> +             Note that the duration is rounded by the monitoring period
> >> +             defined by DEVFREQ_INTERVAL in /drivers/base/power/devfreq.c.
> >> +
> >> +What:                /sys/power/devfreq/min_interval
> >> +Date:                May 2011
> >> +Contact:     MyungJoo Ham <myungjoo.ham@samsung.com>
> >> +Description:
> >> +             The /sys/power/devfreq/min_interval file shows the monitoring
> >> +             period defined by DEVFREQ_INTERVAL in
> >> +             /drivers/base/power/devfreq.c. The duration of device tickling
> >> +             is rounded-up by DEVFREQ_INTERVAL.
> >> +
> >> +What:                /sys/power/devfreq/monitoring
> >> +Date:                May 2011
> >> +Contact:     MyungJoo Ham <myungjoo.ham@samsung.com>
> >> +Description:
> >> +             The /sys/power/devfreq/monitoring file shows whether DEVFREQ
> >> +             is periodically monitoring. Periodic monitoring is activated
> >> +             if there is a device that wants periodic monitoring for DVFS or
> >> +             there is a device that is tickled (and the tickling duration is
> >> +             not yet expired).
> >> diff --git a/drivers/base/power/devfreq.c b/drivers/base/power/devfreq.c
> >> index 7648a94..709c138 100644
> >> --- a/drivers/base/power/devfreq.c
> >> +++ b/drivers/base/power/devfreq.c
> >> @@ -40,6 +40,9 @@ static LIST_HEAD(devfreq_list);
> >>  /* Exclusive access to devfreq_list and its elements */
> >>  static DEFINE_MUTEX(devfreq_list_lock);
> >>
> >> +static struct kobject *devfreq_kobj;
> >> +static struct attribute_group dev_attr_group;
> >> +
> >>  /**
> >>   * find_device_devfreq() - find devfreq struct using device pointer
> >>   * @dev:     device pointer used to lookup device DEVFREQ.
> >> @@ -237,6 +240,8 @@ int devfreq_add_device(struct device *dev, struct devfreq_dev_profile *profile,
> >>               queue_delayed_work(devfreq_wq, &devfreq_work,
> >>                                  msecs_to_jiffies(DEVFREQ_INTERVAL));
> >>       }
> >> +
> >> +     sysfs_update_group(&dev->kobj, &dev_attr_group);
> >
> > This appears to modify the attributes of the device, but anything like it is
> > not mentioned in the documentation.  What's up?
> 
> It is mentioned "Documentation/ABI/testing/sysfs-devices-devfreq".

Yes, it is, sorry.  But it should be under /sys/devices/.../power/ .

> >
> >>  out:
> >>       mutex_unlock(&devfreq_list_lock);
> >>
> >> @@ -263,6 +268,8 @@ int devfreq_remove_device(struct device *dev)
> >>               return -EINVAL;
> >>       }
> >>
> >> +     sysfs_remove_group(&dev->kobj, &dev_attr_group);
> >> +
> >>       list_del(&devfreq->node);
> >>
> >>       kfree(devfreq);
> >> @@ -344,7 +351,7 @@ static int _devfreq_tickle_device(struct devfreq *df, unsigned long delay)
> >>       if (devfreq_wq && !polling) {
> >>               polling = true;
> >>               queue_delayed_work(devfreq_wq, &devfreq_work,
> >> -                             msecs_to_jiffies(DEVFREQ_INTERVAL));
> >> +                                msecs_to_jiffies(DEVFREQ_INTERVAL));
> >
> > This change doesn't seem to belong to this patch.
> 
> Oops. I'll rearrange the patch series.
> 
> >
> >>       }
> >>
> >>       return err;
> >> @@ -382,6 +389,116 @@ int devfreq_tickle_device(struct device *dev, unsigned long duration_ms)
> >>       return err;
> >>  }
> >>
> >> +static int num_tickle_all;
> >> +static ssize_t tickle_all(struct device *dev, struct device_attribute *attr,
> >> +                       const char *buf, size_t count)
> >> +{
> >> +     int duration = 0;
> >> +     struct devfreq *tmp;
> >> +     unsigned long delay;
> >> +
> >> +     sscanf(buf, "%d", &duration);
> >> +     if (duration < DEVFREQ_INTERVAL)
> >> +             duration = DEVFREQ_INTERVAL;
> >> +
> >> +     if (unlikely(IS_ERR_OR_NULL(dev))) {
> >> +             pr_err("%s: Invalid parameters\n", __func__);
> >
> > Please say "null device" instead of in addition to the above.
> 
> Ok. I'll let it say "Null or invalid device" as it's IS_ERR_OR_NULL.

OK

> >
> >> +             return -EINVAL;
> >> +     }
> >> +
> >> +     delay = DIV_ROUND_UP(duration, DEVFREQ_INTERVAL);
> >> +
> >> +     mutex_lock(&devfreq_list_lock);
> >> +     list_for_each_entry(tmp, &devfreq_list, node) {
> >> +             _devfreq_tickle_device(tmp, delay);
> >> +     }
> >> +     mutex_unlock(&devfreq_list_lock);
> >> +
> >> +     num_tickle_all++;
> >> +     return count;
> >> +}
> >> +
> >> +static ssize_t show_num_tickle_all(struct device *dev,
> >> +                                struct device_attribute *attr, char *buf)
> >> +{
> >> +     return sprintf(buf, "%d\n", num_tickle_all);
> >> +}
> >> +
> >> +static ssize_t show_min_interval(struct device *dev,
> >> +                              struct device_attribute *attr, char *buf)
> >> +{
> >> +     return sprintf(buf, "%d\n", DEVFREQ_INTERVAL);
> >> +}
> >> +
> >> +static ssize_t show_monitoring(struct device *dev,
> >> +                            struct device_attribute *attr, char *buf)
> >> +{
> >> +     return sprintf(buf, "%d\n", monitoring ? 1 : 0);
> >> +}
> >> +
> >> +static DEVICE_ATTR(tickle_all, 0644, show_num_tickle_all, tickle_all);
> >> +static DEVICE_ATTR(min_interval, 0444, show_min_interval, NULL);
> >> +static DEVICE_ATTR(monitoring, 0444, show_monitoring, NULL);
> >> +static struct attribute *devfreq_entries[] = {
> >> +     &dev_attr_tickle_all.attr,
> >> +     &dev_attr_min_interval.attr,
> >> +     &dev_attr_monitoring.attr,
> >> +     NULL,
> >> +};
> >> +static struct attribute_group devfreq_attr_group = {
> >> +     .name   = NULL,
> >> +     .attrs  = devfreq_entries,
> >> +};
> >> +
> >> +static ssize_t tickle(struct device *dev, struct device_attribute *attr,
> >> +                   const char *buf, size_t count)
> >> +{
> >> +     int duration;
> >> +     struct devfreq *df;
> >> +     unsigned long delay;
> >> +
> >> +     sscanf(buf, "%d", &duration);
> >> +     if (duration < DEVFREQ_INTERVAL)
> >> +             duration = DEVFREQ_INTERVAL;
> >> +
> >> +     if (unlikely(IS_ERR_OR_NULL(dev))) {
> >> +             pr_err("%s: Invalid parameters\n", __func__);
> >
> > Like above.
> 
> Yup.
> 
> >
> >> +             return -EINVAL;
> >> +     }
> >> +
> >> +     delay = DIV_ROUND_UP(duration, DEVFREQ_INTERVAL);
> >> +
> >> +     mutex_lock(&devfreq_list_lock);
> >> +     df = find_device_devfreq(dev);
> >> +     _devfreq_tickle_device(df, delay);
> >> +     mutex_unlock(&devfreq_list_lock);
> >> +
> >> +     return count;
> >> +}
> >> +
> >> +static ssize_t show_num_tickle(struct device *dev,
> >> +                            struct device_attribute *attr, char *buf)
> >> +{
> >> +     struct devfreq *df;
> >> +
> >> +     df = find_device_devfreq(dev);
> >> +
> >> +     if (!IS_ERR(df))
> >> +             return sprintf(buf, "%d\n", df->num_tickle);
> >> +
> >> +     return PTR_ERR(df);
> >> +}
> >> +
> >> +static DEVICE_ATTR(tickle, 0644, show_num_tickle, tickle);
> >> +static struct attribute *dev_entries[] = {
> >> +     &dev_attr_tickle.attr,
> >> +     NULL,
> >> +};
> >> +static struct attribute_group dev_attr_group = {
> >> +     .name   = "devfreq",
> >> +     .attrs  = dev_entries,
> >> +};
> >> +
> >>  static int __init devfreq_init(void)
> >>  {
> >>       mutex_lock(&devfreq_list_lock);
> >> @@ -389,6 +506,20 @@ static int __init devfreq_init(void)
> >>       polling = false;
> >>       devfreq_wq = create_freezable_workqueue("devfreq_wq");
> >>       INIT_DELAYED_WORK_DEFERRABLE(&devfreq_work, devfreq_monitor);
> >> +
> >> +#ifdef CONFIG_PM
> >> +     /* Create sysfs */
> >> +     devfreq_kobj = kobject_create_and_add("devfreq", power_kobj);
> >
> > Hmm, so power_kobj is global?  It shouldn't be.
> 
> Yes, it is global and it's declared at include/linux/kobject.h.

Oh, well.

> >
> > Generally, whatever adds attributes to /sys/power should be located in
> > kernel/power/ .
> 
> I've put it at /sys/power because these attributes are global to every
> device with devfreq.

That's fine.

What I wanted to say was that all code adding attributes into /sys/power
should be located under kernel/power/ in the kernel source tree.

Otherwise we'll end up with a mess of dependencies that's hard to maintain.

> Anyway, if you think that's a weird location for it, could you please
> give me some hints on where would be proper to locate system-wide
> devfreq sysfs attributes?
> Or, do you think kernel/power/devfreq.c would be a proper location for devfreq?
> 
> Or, what about allowing every /sys/devices/.../devfreq/global/* to be
> the same "global" attributes of devfreq?
> 
> >
> >> +     if (!devfreq_kobj) {
> >> +             pr_err("Unable to create DEVFREQ kobject.\n");
> >> +             goto out;
> >> +     }
> >> +     if (sysfs_create_group(devfreq_kobj, &devfreq_attr_group)) {
> >> +             pr_err("Unable to create DEVFREQ sysfs entries.\n");
> >> +             goto out;
> >> +     }
> >> +#endif
> >> +out:
> >>       mutex_unlock(&devfreq_list_lock);
> >>
> >>       devfreq_monitor(&devfreq_work.work);
> >> diff --git a/include/linux/devfreq.h b/include/linux/devfreq.h
> >> index 1ec9a40..69334e7 100644
> >> --- a/include/linux/devfreq.h
> >> +++ b/include/linux/devfreq.h
> >> @@ -59,6 +59,7 @@ struct devfreq_governor {
> >>   *           at each executino of devfreq_monitor, tickle is decremented.
> >>   *           User may tickle a device-devfreq in order to set maximum
> >>   *           frequency instaneously with some guaranteed duration.
> >> + * @num_tickle       number of tickle calls.
> >>   *
> >>   * This structure stores the DEVFREQ information for a give device.
> >>   */
> >> @@ -72,6 +73,8 @@ struct devfreq {
> >>       unsigned long previous_freq;
> >>       unsigned int next_polling;
> >>       unsigned int tickle;
> >> +
> >> +     unsigned int num_tickle;
> >>  };
> >>
> >>  #if defined(CONFIG_PM_DEVFREQ)
> >
> > It looks like the above two changes should be moved to a separate patch.
> 
> These two changes are for the sysfs interface as counting number of
> tickle is added with sysfs interface.

OK

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 3/3] PM / DEVFREQ: add sysfs interface (including user tickling)
  2011-07-04  7:15     ` MyungJoo Ham
@ 2011-07-04  8:48       ` Rafael J. Wysocki
  2011-07-04  8:48       ` Rafael J. Wysocki
  1 sibling, 0 replies; 35+ messages in thread
From: Rafael J. Wysocki @ 2011-07-04  8:48 UTC (permalink / raw)
  To: myungjoo.ham
  Cc: Len Brown, Greg Kroah-Hartman, linux-kernel, Kyungmin Park,
	linux-pm, Thomas Gleixner

On Monday, July 04, 2011, MyungJoo Ham wrote:
> Hello,
> 
> 2011/7/3 Rafael J. Wysocki <rjw@sisk.pl>:
> > On Friday, May 27, 2011, MyungJoo Ham wrote:
> >> 1. System-wide sysfs interface
> >> - tickle_all  R: number of tickle_all execution
> >>               W: tickle all devfreq devices
> >> - min_interval        R: devfreq monitoring base interval in ms
> >> - monitoring  R: shows whether devfreq monitoring is active or
> >>   not.
> >>
> >> 2. Device specific sysfs interface
> >> - tickle      R: number of tickle execution for the device
> >>               W: tickle the device
> >>
> >> Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
> >> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
> >>
> >> --
> >> Changed from v2
> >> - add ABI entries for devfreq sysfs interface
> >> ---
> >>  Documentation/ABI/testing/sysfs-devices-devfreq |   21 ++++
> >>  Documentation/ABI/testing/sysfs-power           |   43 ++++++++
> >>  drivers/base/power/devfreq.c                    |  133 ++++++++++++++++++++++-
> >>  include/linux/devfreq.h                         |    3 +
> >>  4 files changed, 199 insertions(+), 1 deletions(-)
> >>  create mode 100644 Documentation/ABI/testing/sysfs-devices-devfreq
> >>
> >> diff --git a/Documentation/ABI/testing/sysfs-devices-devfreq b/Documentation/ABI/testing/sysfs-devices-devfreq
> >> new file mode 100644
> >> index 0000000..7f35a64
> >> --- /dev/null
> >> +++ b/Documentation/ABI/testing/sysfs-devices-devfreq
> >> @@ -0,0 +1,21 @@
> >> +What:                /sys/devices/.../devfreq/
> >> +Date:                May 2011
> >> +Contact:     MyungJoo Ham <myungjoo.ham@samsung.com>
> >> +Description:
> >> +             The /sys/device/.../devfreq directory will contain files
> >> +             that provide interfaces to DEVFREQ for a specific device.
> >> +
> >> +What:                /sys/devices/.../devfreq/tickle
> >> +Date:                May 2011
> >> +Contact:     MyungJoo Ham <myungjoo.ham@samsung.com>
> >> +Description:
> >> +             The /sys/devices/.../devfreq/tickle file allows user space
> >> +             to force the corresponding device to operate at its maximum
> >> +             operable frequency instaneously and temporarily. After a
> >> +             designated duration has passed, the operating frequency returns
> >> +             to normal. When a user reads the tickle entry, it returns
> >> +             the number of tickle executions for the device. When a user
> >> +             writes to the tickle entry with the tickle duration in ms,
> >> +             the effect of device tickling is held for the designated
> >> +             duration. Note that the duration is rounded-up by
> >> +             the value DEVFREQ_INTERVAL defined in devfreq.c
> >> diff --git a/Documentation/ABI/testing/sysfs-power b/Documentation/ABI/testing/sysfs-power
> >> index b464d12..4d8434b 100644
> >> --- a/Documentation/ABI/testing/sysfs-power
> >> +++ b/Documentation/ABI/testing/sysfs-power
> >> @@ -172,3 +172,46 @@ Description:
> >>
> >>               Reading from this file will display the current value, which is
> >>               set to 1 MB by default.
> >> +
> >> +What:                /sys/power/devfreq/
> >> +Date:                May 2011
> >> +Contact:     MyungJoo Ham <myungjoo.ham@samsung.com>
> >> +Description:
> >> +             The /sys/power/devfreq directory will contain files that will
> >> +             provide a unified interface to the DEVFREQ, a generic DVFS
> >> +             (dynamic voltage and frequency scaling) framework.
> >> +
> >> +What:                /sys/power/devfreq/tickle_all
> >> +Date:                May 2011
> >> +Contact:     MyungJoo Ham <myungjoo.ham@samsung.com>
> >> +Description:
> >> +             The /sys/power/devfreq/tickle_all file allows user space to
> >> +             force every device with DEVFREQ to operate at the maximum
> >> +             frequency of the device instaneously and temporarily. After
> >> +             a designated delay has passed, the operating frequency returns
> >> +             to normal. If a user reads the tickle_all entry, it returns
> >> +             the number of tickle_all executions. When writing to the
> >> +             tickle_all entry, the user should supply with the duration of
> >> +             tickle in ms (the "designated delay" mentioned before). Then,
> >> +             the effect of tickle_all will hold for the denoted duration.
> >> +             Note that the duration is rounded by the monitoring period
> >> +             defined by DEVFREQ_INTERVAL in /drivers/base/power/devfreq.c.
> >> +
> >> +What:                /sys/power/devfreq/min_interval
> >> +Date:                May 2011
> >> +Contact:     MyungJoo Ham <myungjoo.ham@samsung.com>
> >> +Description:
> >> +             The /sys/power/devfreq/min_interval file shows the monitoring
> >> +             period defined by DEVFREQ_INTERVAL in
> >> +             /drivers/base/power/devfreq.c. The duration of device tickling
> >> +             is rounded-up by DEVFREQ_INTERVAL.
> >> +
> >> +What:                /sys/power/devfreq/monitoring
> >> +Date:                May 2011
> >> +Contact:     MyungJoo Ham <myungjoo.ham@samsung.com>
> >> +Description:
> >> +             The /sys/power/devfreq/monitoring file shows whether DEVFREQ
> >> +             is periodically monitoring. Periodic monitoring is activated
> >> +             if there is a device that wants periodic monitoring for DVFS or
> >> +             there is a device that is tickled (and the tickling duration is
> >> +             not yet expired).
> >> diff --git a/drivers/base/power/devfreq.c b/drivers/base/power/devfreq.c
> >> index 7648a94..709c138 100644
> >> --- a/drivers/base/power/devfreq.c
> >> +++ b/drivers/base/power/devfreq.c
> >> @@ -40,6 +40,9 @@ static LIST_HEAD(devfreq_list);
> >>  /* Exclusive access to devfreq_list and its elements */
> >>  static DEFINE_MUTEX(devfreq_list_lock);
> >>
> >> +static struct kobject *devfreq_kobj;
> >> +static struct attribute_group dev_attr_group;
> >> +
> >>  /**
> >>   * find_device_devfreq() - find devfreq struct using device pointer
> >>   * @dev:     device pointer used to lookup device DEVFREQ.
> >> @@ -237,6 +240,8 @@ int devfreq_add_device(struct device *dev, struct devfreq_dev_profile *profile,
> >>               queue_delayed_work(devfreq_wq, &devfreq_work,
> >>                                  msecs_to_jiffies(DEVFREQ_INTERVAL));
> >>       }
> >> +
> >> +     sysfs_update_group(&dev->kobj, &dev_attr_group);
> >
> > This appears to modify the attributes of the device, but anything like it is
> > not mentioned in the documentation.  What's up?
> 
> It is mentioned "Documentation/ABI/testing/sysfs-devices-devfreq".

Yes, it is, sorry.  But it should be under /sys/devices/.../power/ .

> >
> >>  out:
> >>       mutex_unlock(&devfreq_list_lock);
> >>
> >> @@ -263,6 +268,8 @@ int devfreq_remove_device(struct device *dev)
> >>               return -EINVAL;
> >>       }
> >>
> >> +     sysfs_remove_group(&dev->kobj, &dev_attr_group);
> >> +
> >>       list_del(&devfreq->node);
> >>
> >>       kfree(devfreq);
> >> @@ -344,7 +351,7 @@ static int _devfreq_tickle_device(struct devfreq *df, unsigned long delay)
> >>       if (devfreq_wq && !polling) {
> >>               polling = true;
> >>               queue_delayed_work(devfreq_wq, &devfreq_work,
> >> -                             msecs_to_jiffies(DEVFREQ_INTERVAL));
> >> +                                msecs_to_jiffies(DEVFREQ_INTERVAL));
> >
> > This change doesn't seem to belong to this patch.
> 
> Oops. I'll rearrange the patch series.
> 
> >
> >>       }
> >>
> >>       return err;
> >> @@ -382,6 +389,116 @@ int devfreq_tickle_device(struct device *dev, unsigned long duration_ms)
> >>       return err;
> >>  }
> >>
> >> +static int num_tickle_all;
> >> +static ssize_t tickle_all(struct device *dev, struct device_attribute *attr,
> >> +                       const char *buf, size_t count)
> >> +{
> >> +     int duration = 0;
> >> +     struct devfreq *tmp;
> >> +     unsigned long delay;
> >> +
> >> +     sscanf(buf, "%d", &duration);
> >> +     if (duration < DEVFREQ_INTERVAL)
> >> +             duration = DEVFREQ_INTERVAL;
> >> +
> >> +     if (unlikely(IS_ERR_OR_NULL(dev))) {
> >> +             pr_err("%s: Invalid parameters\n", __func__);
> >
> > Please say "null device" instead of in addition to the above.
> 
> Ok. I'll let it say "Null or invalid device" as it's IS_ERR_OR_NULL.

OK

> >
> >> +             return -EINVAL;
> >> +     }
> >> +
> >> +     delay = DIV_ROUND_UP(duration, DEVFREQ_INTERVAL);
> >> +
> >> +     mutex_lock(&devfreq_list_lock);
> >> +     list_for_each_entry(tmp, &devfreq_list, node) {
> >> +             _devfreq_tickle_device(tmp, delay);
> >> +     }
> >> +     mutex_unlock(&devfreq_list_lock);
> >> +
> >> +     num_tickle_all++;
> >> +     return count;
> >> +}
> >> +
> >> +static ssize_t show_num_tickle_all(struct device *dev,
> >> +                                struct device_attribute *attr, char *buf)
> >> +{
> >> +     return sprintf(buf, "%d\n", num_tickle_all);
> >> +}
> >> +
> >> +static ssize_t show_min_interval(struct device *dev,
> >> +                              struct device_attribute *attr, char *buf)
> >> +{
> >> +     return sprintf(buf, "%d\n", DEVFREQ_INTERVAL);
> >> +}
> >> +
> >> +static ssize_t show_monitoring(struct device *dev,
> >> +                            struct device_attribute *attr, char *buf)
> >> +{
> >> +     return sprintf(buf, "%d\n", monitoring ? 1 : 0);
> >> +}
> >> +
> >> +static DEVICE_ATTR(tickle_all, 0644, show_num_tickle_all, tickle_all);
> >> +static DEVICE_ATTR(min_interval, 0444, show_min_interval, NULL);
> >> +static DEVICE_ATTR(monitoring, 0444, show_monitoring, NULL);
> >> +static struct attribute *devfreq_entries[] = {
> >> +     &dev_attr_tickle_all.attr,
> >> +     &dev_attr_min_interval.attr,
> >> +     &dev_attr_monitoring.attr,
> >> +     NULL,
> >> +};
> >> +static struct attribute_group devfreq_attr_group = {
> >> +     .name   = NULL,
> >> +     .attrs  = devfreq_entries,
> >> +};
> >> +
> >> +static ssize_t tickle(struct device *dev, struct device_attribute *attr,
> >> +                   const char *buf, size_t count)
> >> +{
> >> +     int duration;
> >> +     struct devfreq *df;
> >> +     unsigned long delay;
> >> +
> >> +     sscanf(buf, "%d", &duration);
> >> +     if (duration < DEVFREQ_INTERVAL)
> >> +             duration = DEVFREQ_INTERVAL;
> >> +
> >> +     if (unlikely(IS_ERR_OR_NULL(dev))) {
> >> +             pr_err("%s: Invalid parameters\n", __func__);
> >
> > Like above.
> 
> Yup.
> 
> >
> >> +             return -EINVAL;
> >> +     }
> >> +
> >> +     delay = DIV_ROUND_UP(duration, DEVFREQ_INTERVAL);
> >> +
> >> +     mutex_lock(&devfreq_list_lock);
> >> +     df = find_device_devfreq(dev);
> >> +     _devfreq_tickle_device(df, delay);
> >> +     mutex_unlock(&devfreq_list_lock);
> >> +
> >> +     return count;
> >> +}
> >> +
> >> +static ssize_t show_num_tickle(struct device *dev,
> >> +                            struct device_attribute *attr, char *buf)
> >> +{
> >> +     struct devfreq *df;
> >> +
> >> +     df = find_device_devfreq(dev);
> >> +
> >> +     if (!IS_ERR(df))
> >> +             return sprintf(buf, "%d\n", df->num_tickle);
> >> +
> >> +     return PTR_ERR(df);
> >> +}
> >> +
> >> +static DEVICE_ATTR(tickle, 0644, show_num_tickle, tickle);
> >> +static struct attribute *dev_entries[] = {
> >> +     &dev_attr_tickle.attr,
> >> +     NULL,
> >> +};
> >> +static struct attribute_group dev_attr_group = {
> >> +     .name   = "devfreq",
> >> +     .attrs  = dev_entries,
> >> +};
> >> +
> >>  static int __init devfreq_init(void)
> >>  {
> >>       mutex_lock(&devfreq_list_lock);
> >> @@ -389,6 +506,20 @@ static int __init devfreq_init(void)
> >>       polling = false;
> >>       devfreq_wq = create_freezable_workqueue("devfreq_wq");
> >>       INIT_DELAYED_WORK_DEFERRABLE(&devfreq_work, devfreq_monitor);
> >> +
> >> +#ifdef CONFIG_PM
> >> +     /* Create sysfs */
> >> +     devfreq_kobj = kobject_create_and_add("devfreq", power_kobj);
> >
> > Hmm, so power_kobj is global?  It shouldn't be.
> 
> Yes, it is global and it's declared at include/linux/kobject.h.

Oh, well.

> >
> > Generally, whatever adds attributes to /sys/power should be located in
> > kernel/power/ .
> 
> I've put it at /sys/power because these attributes are global to every
> device with devfreq.

That's fine.

What I wanted to say was that all code adding attributes into /sys/power
should be located under kernel/power/ in the kernel source tree.

Otherwise we'll end up with a mess of dependencies that's hard to maintain.

> Anyway, if you think that's a weird location for it, could you please
> give me some hints on where would be proper to locate system-wide
> devfreq sysfs attributes?
> Or, do you think kernel/power/devfreq.c would be a proper location for devfreq?
> 
> Or, what about allowing every /sys/devices/.../devfreq/global/* to be
> the same "global" attributes of devfreq?
> 
> >
> >> +     if (!devfreq_kobj) {
> >> +             pr_err("Unable to create DEVFREQ kobject.\n");
> >> +             goto out;
> >> +     }
> >> +     if (sysfs_create_group(devfreq_kobj, &devfreq_attr_group)) {
> >> +             pr_err("Unable to create DEVFREQ sysfs entries.\n");
> >> +             goto out;
> >> +     }
> >> +#endif
> >> +out:
> >>       mutex_unlock(&devfreq_list_lock);
> >>
> >>       devfreq_monitor(&devfreq_work.work);
> >> diff --git a/include/linux/devfreq.h b/include/linux/devfreq.h
> >> index 1ec9a40..69334e7 100644
> >> --- a/include/linux/devfreq.h
> >> +++ b/include/linux/devfreq.h
> >> @@ -59,6 +59,7 @@ struct devfreq_governor {
> >>   *           at each executino of devfreq_monitor, tickle is decremented.
> >>   *           User may tickle a device-devfreq in order to set maximum
> >>   *           frequency instaneously with some guaranteed duration.
> >> + * @num_tickle       number of tickle calls.
> >>   *
> >>   * This structure stores the DEVFREQ information for a give device.
> >>   */
> >> @@ -72,6 +73,8 @@ struct devfreq {
> >>       unsigned long previous_freq;
> >>       unsigned int next_polling;
> >>       unsigned int tickle;
> >> +
> >> +     unsigned int num_tickle;
> >>  };
> >>
> >>  #if defined(CONFIG_PM_DEVFREQ)
> >
> > It looks like the above two changes should be moved to a separate patch.
> 
> These two changes are for the sysfs interface as counting number of
> tickle is added with sysfs interface.

OK

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 1/3] PM: Introduce DEVFREQ: generic DVFS framework with device-specific OPPs
  2011-07-04  8:34   ` MyungJoo Ham
  2011-07-04  8:57     ` Rafael J. Wysocki
@ 2011-07-04  8:57     ` Rafael J. Wysocki
  1 sibling, 0 replies; 35+ messages in thread
From: Rafael J. Wysocki @ 2011-07-04  8:57 UTC (permalink / raw)
  To: myungjoo.ham
  Cc: linux-pm, linux-kernel, Len Brown, Pavel Machek,
	Greg Kroah-Hartman, Kyungmin Park, Jiejing Zhang, Colin Cross,
	Nishanth Menon, Thomas Gleixner, Chanwoo Choi

On Monday, July 04, 2011, MyungJoo Ham wrote:
> 2011/7/3 Rafael J. Wysocki <rjw@sisk.pl>:
> > Hi,
> >
> > I'm not sure if spelling DEVFREQ in the kerneldoc comments with capitals is
> > a good idea, it looks kind of odd.  I'm not sure what would be look better,
> > though.
> 
> Umm.. what about "devfreq" using all non-capitals? It looks fine to me as well.

Works for me.

> >
> > On Friday, May 27, 2011, MyungJoo Ham wrote:
> >
> > ...
> >> +/**
> >> + * devfreq_monitor() - Regularly run devfreq_do() and support device DEVFREQ tickle.
> >
> > I'd say "periodically" istead of "regularly".
> 
> Yes, that is the proper term. Thanks.
> 
> >
> >> + * @work: the work struct used to run devfreq_monitor periodically.
> >
> > Also please say something more about the "tickle" thinkg here.
> 
> Sure.
> 
> >
> >> + */
> >> +static void devfreq_monitor(struct work_struct *work)
> >> +{
> >> +     struct devfreq *devfreq;
> >> +     int error;
> >> +     bool continue_polling = false;
> >> +     struct devfreq *to_remove = NULL;
> >> +
> >> +     mutex_lock(&devfreq_list_lock);
> >> +
> >> +     list_for_each_entry(devfreq, &devfreq_list, node) {
> >> +             /* Remove the devfreq entry that failed */
> >> +             if (to_remove) {
> >> +                     list_del(&to_remove->node);
> >> +                     kfree(to_remove);
> >> +                     to_remove = NULL;
> >> +             }
> >> +
> >> +             /*
> >> +              * If the device is tickled and the tickle duration is left,
> >> +              * do not change the frequency for a while
> >> +              */
> >> +             if (devfreq->tickle) {
> >> +                     continue_polling = true;
> >> +                     devfreq->tickle--;
> >> +
> >> +                     /*
> >> +                      * If the tickle is ending and the device is not going
> >> +                      * to poll, force the device to poll next time so that
> >> +                      * it can return to the original frequency afterwards.
> >> +                      * However, non-polling device will have 0 polling_ms,
> >> +                      * it will not poll again later.
> >> +                      */
> >> +                     if (devfreq->tickle == 0 && devfreq->next_polling == 0)
> >> +                             devfreq->next_polling = 1;
> >> +
> >> +                     continue;
> >> +             }
> >> +
> >> +             /* This device does not require polling */
> >> +             if (devfreq->next_polling == 0)
> >> +                     continue;
> >> +
> >> +             continue_polling = true;
> >> +
> >> +             if (devfreq->next_polling == 1) {
> >> +                     /* This device is polling this time */
> >
> > I'd remove this comment, it's confusing IMO.  Besides, it may be better
> > to structure the code like this:
> >
> > if (devfreq->next_polling-- == 1) {
> > }
> >
> > and then you wouldn't need the "else" at all.
> 
> Ok, I'll try that style.
> 
> >
> >> +                     error = devfreq_do(devfreq);
> >> +                     if (error && error != -EAGAIN) {
> >> +                             /*
> >> +                              * Remove a devfreq with error. However,
> >> +                              * We cannot remove it right here because the
> >
> > Comma after "here", please.
> 
> "We" should be "we" here, but no comma there though:
> http://owl.english.purdue.edu/owl/resource/607/02/

OK, whatever.

> However, "However, because the ..... above, we cannot remove it right
> here" is correct.

Yes, it is.

> >
> >> +                              * devfreq pointer is going to be used by
> >> +                              * list_for_each_entry above. Thus, it is
> >> +                              * removed afterwards.
> >
> > Why don't you use list_for_each_entry_safe(), then?
> 
> Ah.. that's a good idea. Thanks! Now, I can reorganize this one.
> 
> >
> >> +                              */
> >> +                             to_remove = devfreq->dev;
> >> +                             dev_err(devfreq->dev, "devfreq_do error(%d). "
> >> +                                     "DEVFREQ is removed from the device\n",
> >> +                                     error);
> >> +                             continue;
> >> +                     }
> >> +                     devfreq->next_polling = DIV_ROUND_UP(
> >> +                                             devfreq->profile->polling_ms,
> >> +                                             DEVFREQ_INTERVAL);
> >> +             } else {
> >> +                     /* The device will poll later when next_polling = 1 */
> >> +                     devfreq->next_polling--;
> >> +             }
> >> +     }
> >> +
> >> +     if (to_remove) {
> >> +             list_del(&to_remove->node);
> >> +             kfree(to_remove);
> >> +             to_remove = NULL;
> >> +     }
> >> +
> >> +     if (continue_polling) {
> >> +             polling = true;
> >> +             queue_delayed_work(devfreq_wq, &devfreq_work,
> >> +                                msecs_to_jiffies(DEVFREQ_INTERVAL));
> >> +     } else {
> >> +             polling = false;
> >> +     }
> >
> > OK, so why exactly continue_polling is needed?  It seems you might siply use
> > "polling" instead of it above.
> 
> The purpose of continue_polling seems to disappear in the middle of
> development. I'll remove it.
> 
> >
> >> +
> >> +     mutex_unlock(&devfreq_list_lock);
> >> +}
> > ...
> >> +/**
> >> + * devfreq_update() - Notify that the device OPP has been changed.
> >> + * @dev:     the device whose OPP has been changed.
> >> + * @may_not_exist:   do not print error message even if the device
> >> + *                   does not have devfreq entry.
> >
> > This argument isn't used any more.
> 
> Ah.. yes. sure.
> 
> >
> >> + */
> >> +int devfreq_update(struct device *dev)
> >> +{
> >> +     struct devfreq *devfreq;
> >> +     int err = 0;
> >> +
> >> +     mutex_lock(&devfreq_list_lock);
> >> +
> >> +     devfreq = find_device_devfreq(dev);
> >> +     if (IS_ERR(devfreq)) {
> >> +             err = PTR_ERR(devfreq);
> >> +             goto out;
> >> +     }
> >> +
> >> +     if (devfreq->tickle) {
> >> +             /* If the max freq available is changed, re-tickle */
> >
> > It would be good to explain what "re-tickle" means.
> 
> Of course. I'll explain that.
> 
> >
> >> +             unsigned long freq = devfreq->profile->max_freq;
> >> +             struct opp *opp = opp_find_freq_floor(devfreq->dev, &freq);
> >> +
> >> +             if (IS_ERR(opp)) {
> >> +                     err = PTR_ERR(opp);
> >> +                     goto out;
> >> +             }
> >> +
> >> +             /* Max freq available is not changed */
> >> +             if (devfreq->previous_freq == freq)
> >> +                     goto out;
> >> +
> >> +             err = devfreq->profile->target(devfreq->dev, opp);
> >> +             if (!err)
> >> +                     devfreq->previous_freq = freq;
> >
> > Why don't we run devfreq_do() in this case?
> 
> When the list of available frequencies is updated and the device is to
> be kept at its maximum frequency, we do not need to reevaluate the
> device's activities to setup the new frequency. In such cases, we only
> need to keep it at its maximum freuqency. If the maximum frequency is
> not changed, we don't need anything to do (when "tickling" is
> completed, devfreq_do() will reevaluate automatically); otherwise,
> tickling again at the new frequency is needed (re-tickle).

OK

> >> +     } else {
> >> +             /* Reevaluate the proper frequency */
> >> +             err = devfreq_do(devfreq);
> >> +     }
> >> +
> >> +out:
> >> +     mutex_unlock(&devfreq_list_lock);
> >> +     return err;
> >> +}
> >> +
> >
> > Add a kerneldoc comment here, please.
> 
> Yes. I'll add one. (and for any other functions missing kerneldoc comments)

Well, if the function is a static one-liner, you can skip the kerneldoc IMHO
However, that one below is not really a one-liner.. :-)

> >> +static int _devfreq_tickle_device(struct devfreq *df, unsigned long delay)
> >> +{
> >> +     int err = 0;
> >> +     unsigned long freq;
> >> +     struct opp *opp;
> >> +
> >> +     freq = df->profile->max_freq;
> >> +     opp = opp_find_freq_floor(df->dev, &freq);
> >> +     if (IS_ERR(opp))
> >> +             return PTR_ERR(opp);
> >> +
> >> +     if (df->previous_freq != freq) {
> >> +             err = df->profile->target(df->dev, opp);
> >> +             if (!err)
> >> +                     df->previous_freq = freq;
> >> +     }
> >> +     if (err) {
> >> +             dev_err(df->dev, "%s: Cannot set frequency.\n", __func__);
> >> +     } else {
> >> +             df->tickle = delay;
> >> +             df->num_tickle++;
> >> +     }
> >> +
> >> +     if (devfreq_wq && !polling) {
> >> +             polling = true;
> >> +             queue_delayed_work(devfreq_wq, &devfreq_work,
> >> +                             msecs_to_jiffies(DEVFREQ_INTERVAL));
> >> +     }
> >> +
> >> +     return err;
> >> +}
> >
> > Thanks,
> > Rafael
> >
> 
> Thank you so much for the comments!

You're welcome.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 1/3] PM: Introduce DEVFREQ: generic DVFS framework with device-specific OPPs
  2011-07-04  8:34   ` MyungJoo Ham
@ 2011-07-04  8:57     ` Rafael J. Wysocki
  2011-07-04  8:57     ` Rafael J. Wysocki
  1 sibling, 0 replies; 35+ messages in thread
From: Rafael J. Wysocki @ 2011-07-04  8:57 UTC (permalink / raw)
  To: myungjoo.ham
  Cc: Len Brown, Greg Kroah-Hartman, linux-kernel, Kyungmin Park,
	linux-pm, Thomas Gleixner

On Monday, July 04, 2011, MyungJoo Ham wrote:
> 2011/7/3 Rafael J. Wysocki <rjw@sisk.pl>:
> > Hi,
> >
> > I'm not sure if spelling DEVFREQ in the kerneldoc comments with capitals is
> > a good idea, it looks kind of odd.  I'm not sure what would be look better,
> > though.
> 
> Umm.. what about "devfreq" using all non-capitals? It looks fine to me as well.

Works for me.

> >
> > On Friday, May 27, 2011, MyungJoo Ham wrote:
> >
> > ...
> >> +/**
> >> + * devfreq_monitor() - Regularly run devfreq_do() and support device DEVFREQ tickle.
> >
> > I'd say "periodically" istead of "regularly".
> 
> Yes, that is the proper term. Thanks.
> 
> >
> >> + * @work: the work struct used to run devfreq_monitor periodically.
> >
> > Also please say something more about the "tickle" thinkg here.
> 
> Sure.
> 
> >
> >> + */
> >> +static void devfreq_monitor(struct work_struct *work)
> >> +{
> >> +     struct devfreq *devfreq;
> >> +     int error;
> >> +     bool continue_polling = false;
> >> +     struct devfreq *to_remove = NULL;
> >> +
> >> +     mutex_lock(&devfreq_list_lock);
> >> +
> >> +     list_for_each_entry(devfreq, &devfreq_list, node) {
> >> +             /* Remove the devfreq entry that failed */
> >> +             if (to_remove) {
> >> +                     list_del(&to_remove->node);
> >> +                     kfree(to_remove);
> >> +                     to_remove = NULL;
> >> +             }
> >> +
> >> +             /*
> >> +              * If the device is tickled and the tickle duration is left,
> >> +              * do not change the frequency for a while
> >> +              */
> >> +             if (devfreq->tickle) {
> >> +                     continue_polling = true;
> >> +                     devfreq->tickle--;
> >> +
> >> +                     /*
> >> +                      * If the tickle is ending and the device is not going
> >> +                      * to poll, force the device to poll next time so that
> >> +                      * it can return to the original frequency afterwards.
> >> +                      * However, non-polling device will have 0 polling_ms,
> >> +                      * it will not poll again later.
> >> +                      */
> >> +                     if (devfreq->tickle == 0 && devfreq->next_polling == 0)
> >> +                             devfreq->next_polling = 1;
> >> +
> >> +                     continue;
> >> +             }
> >> +
> >> +             /* This device does not require polling */
> >> +             if (devfreq->next_polling == 0)
> >> +                     continue;
> >> +
> >> +             continue_polling = true;
> >> +
> >> +             if (devfreq->next_polling == 1) {
> >> +                     /* This device is polling this time */
> >
> > I'd remove this comment, it's confusing IMO.  Besides, it may be better
> > to structure the code like this:
> >
> > if (devfreq->next_polling-- == 1) {
> > }
> >
> > and then you wouldn't need the "else" at all.
> 
> Ok, I'll try that style.
> 
> >
> >> +                     error = devfreq_do(devfreq);
> >> +                     if (error && error != -EAGAIN) {
> >> +                             /*
> >> +                              * Remove a devfreq with error. However,
> >> +                              * We cannot remove it right here because the
> >
> > Comma after "here", please.
> 
> "We" should be "we" here, but no comma there though:
> http://owl.english.purdue.edu/owl/resource/607/02/

OK, whatever.

> However, "However, because the ..... above, we cannot remove it right
> here" is correct.

Yes, it is.

> >
> >> +                              * devfreq pointer is going to be used by
> >> +                              * list_for_each_entry above. Thus, it is
> >> +                              * removed afterwards.
> >
> > Why don't you use list_for_each_entry_safe(), then?
> 
> Ah.. that's a good idea. Thanks! Now, I can reorganize this one.
> 
> >
> >> +                              */
> >> +                             to_remove = devfreq->dev;
> >> +                             dev_err(devfreq->dev, "devfreq_do error(%d). "
> >> +                                     "DEVFREQ is removed from the device\n",
> >> +                                     error);
> >> +                             continue;
> >> +                     }
> >> +                     devfreq->next_polling = DIV_ROUND_UP(
> >> +                                             devfreq->profile->polling_ms,
> >> +                                             DEVFREQ_INTERVAL);
> >> +             } else {
> >> +                     /* The device will poll later when next_polling = 1 */
> >> +                     devfreq->next_polling--;
> >> +             }
> >> +     }
> >> +
> >> +     if (to_remove) {
> >> +             list_del(&to_remove->node);
> >> +             kfree(to_remove);
> >> +             to_remove = NULL;
> >> +     }
> >> +
> >> +     if (continue_polling) {
> >> +             polling = true;
> >> +             queue_delayed_work(devfreq_wq, &devfreq_work,
> >> +                                msecs_to_jiffies(DEVFREQ_INTERVAL));
> >> +     } else {
> >> +             polling = false;
> >> +     }
> >
> > OK, so why exactly continue_polling is needed?  It seems you might siply use
> > "polling" instead of it above.
> 
> The purpose of continue_polling seems to disappear in the middle of
> development. I'll remove it.
> 
> >
> >> +
> >> +     mutex_unlock(&devfreq_list_lock);
> >> +}
> > ...
> >> +/**
> >> + * devfreq_update() - Notify that the device OPP has been changed.
> >> + * @dev:     the device whose OPP has been changed.
> >> + * @may_not_exist:   do not print error message even if the device
> >> + *                   does not have devfreq entry.
> >
> > This argument isn't used any more.
> 
> Ah.. yes. sure.
> 
> >
> >> + */
> >> +int devfreq_update(struct device *dev)
> >> +{
> >> +     struct devfreq *devfreq;
> >> +     int err = 0;
> >> +
> >> +     mutex_lock(&devfreq_list_lock);
> >> +
> >> +     devfreq = find_device_devfreq(dev);
> >> +     if (IS_ERR(devfreq)) {
> >> +             err = PTR_ERR(devfreq);
> >> +             goto out;
> >> +     }
> >> +
> >> +     if (devfreq->tickle) {
> >> +             /* If the max freq available is changed, re-tickle */
> >
> > It would be good to explain what "re-tickle" means.
> 
> Of course. I'll explain that.
> 
> >
> >> +             unsigned long freq = devfreq->profile->max_freq;
> >> +             struct opp *opp = opp_find_freq_floor(devfreq->dev, &freq);
> >> +
> >> +             if (IS_ERR(opp)) {
> >> +                     err = PTR_ERR(opp);
> >> +                     goto out;
> >> +             }
> >> +
> >> +             /* Max freq available is not changed */
> >> +             if (devfreq->previous_freq == freq)
> >> +                     goto out;
> >> +
> >> +             err = devfreq->profile->target(devfreq->dev, opp);
> >> +             if (!err)
> >> +                     devfreq->previous_freq = freq;
> >
> > Why don't we run devfreq_do() in this case?
> 
> When the list of available frequencies is updated and the device is to
> be kept at its maximum frequency, we do not need to reevaluate the
> device's activities to setup the new frequency. In such cases, we only
> need to keep it at its maximum freuqency. If the maximum frequency is
> not changed, we don't need anything to do (when "tickling" is
> completed, devfreq_do() will reevaluate automatically); otherwise,
> tickling again at the new frequency is needed (re-tickle).

OK

> >> +     } else {
> >> +             /* Reevaluate the proper frequency */
> >> +             err = devfreq_do(devfreq);
> >> +     }
> >> +
> >> +out:
> >> +     mutex_unlock(&devfreq_list_lock);
> >> +     return err;
> >> +}
> >> +
> >
> > Add a kerneldoc comment here, please.
> 
> Yes. I'll add one. (and for any other functions missing kerneldoc comments)

Well, if the function is a static one-liner, you can skip the kerneldoc IMHO
However, that one below is not really a one-liner.. :-)

> >> +static int _devfreq_tickle_device(struct devfreq *df, unsigned long delay)
> >> +{
> >> +     int err = 0;
> >> +     unsigned long freq;
> >> +     struct opp *opp;
> >> +
> >> +     freq = df->profile->max_freq;
> >> +     opp = opp_find_freq_floor(df->dev, &freq);
> >> +     if (IS_ERR(opp))
> >> +             return PTR_ERR(opp);
> >> +
> >> +     if (df->previous_freq != freq) {
> >> +             err = df->profile->target(df->dev, opp);
> >> +             if (!err)
> >> +                     df->previous_freq = freq;
> >> +     }
> >> +     if (err) {
> >> +             dev_err(df->dev, "%s: Cannot set frequency.\n", __func__);
> >> +     } else {
> >> +             df->tickle = delay;
> >> +             df->num_tickle++;
> >> +     }
> >> +
> >> +     if (devfreq_wq && !polling) {
> >> +             polling = true;
> >> +             queue_delayed_work(devfreq_wq, &devfreq_work,
> >> +                             msecs_to_jiffies(DEVFREQ_INTERVAL));
> >> +     }
> >> +
> >> +     return err;
> >> +}
> >
> > Thanks,
> > Rafael
> >
> 
> Thank you so much for the comments!

You're welcome.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 2/3] PM / DEVFREQ: add example governors
  2011-07-04  8:43       ` Rafael J. Wysocki
@ 2011-07-04  8:58         ` MyungJoo Ham
  2011-07-07 21:08           ` Rafael J. Wysocki
  2011-07-07 21:08           ` Rafael J. Wysocki
  2011-07-04  8:58         ` MyungJoo Ham
  1 sibling, 2 replies; 35+ messages in thread
From: MyungJoo Ham @ 2011-07-04  8:58 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-pm, linux-kernel, Len Brown, Pavel Machek,
	Greg Kroah-Hartman, Kyungmin Park, Jiejing Zhang, Colin Cross,
	Nishanth Menon, Thomas Gleixner

On Mon, Jul 4, 2011 at 5:43 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> On Monday, July 04, 2011, MyungJoo Ham wrote:
>> Hello,
>>
>> 2011/7/3 Rafael J. Wysocki <rjw@sisk.pl>:
>> > Hi,
>> >
>> > On Friday, May 27, 2011, MyungJoo Ham wrote:
>> >> Three CPUFREQ-like governors are provided as examples.
>> >>
>> >> powersave: use the lowest frequency possible. The user (device) should
>> >> set the polling_ms as 0 because polling is useless for this governor.
>> >>
>> >> performance: use the highest freqeuncy possible. The user (device)
>> >> should set the polling_ms as 0 because polling is useless for this
>> >> governor.
>> >>
>> >> simple_ondemand: simplified version of CPUFREQ's ONDEMAND governor.
>> >>
>> >> When a user updates OPP entries (enable/disable/add), OPP framework
>> >> automatically notifies DEVFREQ to update operating frequency
>> >> accordingly. Thus, DEVFREQ users (device drivers) do not need to update
>> >> DEVFREQ manually with OPP entry updates or set polling_ms for powersave
>> >> , performance, or any other "static" governors.
>> >>
>> >> Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
>> >> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
>> []
>> >> +
>> >> +     /* Set the desired frequency based on the load */
>> >> +     a = (unsigned long long) stat.busy_time * stat.current_frequency;
>> >
>> > What's the purpose of the conversion?
>>
>> Assuming that the work speed of a device is proportional to its
>> frequency, it measures the amount of work done.
>> It's time * work/time. For example, during the last 10 second, if the
>> busy_time was 5 sec and frequency was 10MHz,
>> it's "50M", which is same as 20MHz and 2.5 sec.
>
> I understand that, but my question was why you're doing a forced conversion
> to (unsigned long long).

Ah.. that was for the 64bit operations.

Both busy_time and current_frequency are 32bit and current_frequency
may be a big number.

Thus, in order to get "freq" value without losing bits (e.g., if
current_frequency = 1GHz and busy_time = 8000, we get an overflow
without 64bit operations), I've inserted 64bit operations with the
conversion. For the cosmetic reasons, it appears that "u64" looks
better though.

>
> Thanks,
> Rafael
>

Thank you.

- MyungJoo


-- 
MyungJoo Ham, Ph.D.
Mobile Software Platform Lab,
Digital Media and Communications (DMC) Business
Samsung Electronics
cell: 82-10-6714-2858

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 2/3] PM / DEVFREQ: add example governors
  2011-07-04  8:43       ` Rafael J. Wysocki
  2011-07-04  8:58         ` MyungJoo Ham
@ 2011-07-04  8:58         ` MyungJoo Ham
  1 sibling, 0 replies; 35+ messages in thread
From: MyungJoo Ham @ 2011-07-04  8:58 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Len Brown, Greg Kroah-Hartman, linux-kernel, Kyungmin Park,
	linux-pm, Thomas Gleixner

On Mon, Jul 4, 2011 at 5:43 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> On Monday, July 04, 2011, MyungJoo Ham wrote:
>> Hello,
>>
>> 2011/7/3 Rafael J. Wysocki <rjw@sisk.pl>:
>> > Hi,
>> >
>> > On Friday, May 27, 2011, MyungJoo Ham wrote:
>> >> Three CPUFREQ-like governors are provided as examples.
>> >>
>> >> powersave: use the lowest frequency possible. The user (device) should
>> >> set the polling_ms as 0 because polling is useless for this governor.
>> >>
>> >> performance: use the highest freqeuncy possible. The user (device)
>> >> should set the polling_ms as 0 because polling is useless for this
>> >> governor.
>> >>
>> >> simple_ondemand: simplified version of CPUFREQ's ONDEMAND governor.
>> >>
>> >> When a user updates OPP entries (enable/disable/add), OPP framework
>> >> automatically notifies DEVFREQ to update operating frequency
>> >> accordingly. Thus, DEVFREQ users (device drivers) do not need to update
>> >> DEVFREQ manually with OPP entry updates or set polling_ms for powersave
>> >> , performance, or any other "static" governors.
>> >>
>> >> Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
>> >> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
>> []
>> >> +
>> >> +     /* Set the desired frequency based on the load */
>> >> +     a = (unsigned long long) stat.busy_time * stat.current_frequency;
>> >
>> > What's the purpose of the conversion?
>>
>> Assuming that the work speed of a device is proportional to its
>> frequency, it measures the amount of work done.
>> It's time * work/time. For example, during the last 10 second, if the
>> busy_time was 5 sec and frequency was 10MHz,
>> it's "50M", which is same as 20MHz and 2.5 sec.
>
> I understand that, but my question was why you're doing a forced conversion
> to (unsigned long long).

Ah.. that was for the 64bit operations.

Both busy_time and current_frequency are 32bit and current_frequency
may be a big number.

Thus, in order to get "freq" value without losing bits (e.g., if
current_frequency = 1GHz and busy_time = 8000, we get an overflow
without 64bit operations), I've inserted 64bit operations with the
conversion. For the cosmetic reasons, it appears that "u64" looks
better though.

>
> Thanks,
> Rafael
>

Thank you.

- MyungJoo


-- 
MyungJoo Ham, Ph.D.
Mobile Software Platform Lab,
Digital Media and Communications (DMC) Business
Samsung Electronics
cell: 82-10-6714-2858

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 2/3] PM / DEVFREQ: add example governors
  2011-07-04  8:58         ` MyungJoo Ham
@ 2011-07-07 21:08           ` Rafael J. Wysocki
  2011-07-07 21:08           ` Rafael J. Wysocki
  1 sibling, 0 replies; 35+ messages in thread
From: Rafael J. Wysocki @ 2011-07-07 21:08 UTC (permalink / raw)
  To: MyungJoo Ham
  Cc: linux-pm, linux-kernel, Len Brown, Pavel Machek,
	Greg Kroah-Hartman, Kyungmin Park, Jiejing Zhang, Colin Cross,
	Nishanth Menon, Thomas Gleixner

On Monday, July 04, 2011, MyungJoo Ham wrote:
> On Mon, Jul 4, 2011 at 5:43 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > On Monday, July 04, 2011, MyungJoo Ham wrote:
> >> Hello,
> >>
> >> 2011/7/3 Rafael J. Wysocki <rjw@sisk.pl>:
> >> > Hi,
> >> >
> >> > On Friday, May 27, 2011, MyungJoo Ham wrote:
> >> >> Three CPUFREQ-like governors are provided as examples.
> >> >>
> >> >> powersave: use the lowest frequency possible. The user (device) should
> >> >> set the polling_ms as 0 because polling is useless for this governor.
> >> >>
> >> >> performance: use the highest freqeuncy possible. The user (device)
> >> >> should set the polling_ms as 0 because polling is useless for this
> >> >> governor.
> >> >>
> >> >> simple_ondemand: simplified version of CPUFREQ's ONDEMAND governor.
> >> >>
> >> >> When a user updates OPP entries (enable/disable/add), OPP framework
> >> >> automatically notifies DEVFREQ to update operating frequency
> >> >> accordingly. Thus, DEVFREQ users (device drivers) do not need to update
> >> >> DEVFREQ manually with OPP entry updates or set polling_ms for powersave
> >> >> , performance, or any other "static" governors.
> >> >>
> >> >> Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
> >> >> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
> >> []
> >> >> +
> >> >> +     /* Set the desired frequency based on the load */
> >> >> +     a = (unsigned long long) stat.busy_time * stat.current_frequency;
> >> >
> >> > What's the purpose of the conversion?
> >>
> >> Assuming that the work speed of a device is proportional to its
> >> frequency, it measures the amount of work done.
> >> It's time * work/time. For example, during the last 10 second, if the
> >> busy_time was 5 sec and frequency was 10MHz,
> >> it's "50M", which is same as 20MHz and 2.5 sec.
> >
> > I understand that, but my question was why you're doing a forced conversion
> > to (unsigned long long).
> 
> Ah.. that was for the 64bit operations.
> 
> Both busy_time and current_frequency are 32bit and current_frequency
> may be a big number.
> 
> Thus, in order to get "freq" value without losing bits (e.g., if
> current_frequency = 1GHz and busy_time = 8000, we get an overflow
> without 64bit operations), I've inserted 64bit operations with the
> conversion. For the cosmetic reasons, it appears that "u64" looks
> better though.

You wouldn't need the explicit type casting if you did

a = stat.busy_time;
a *= stat.current_frequency;

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 2/3] PM / DEVFREQ: add example governors
  2011-07-04  8:58         ` MyungJoo Ham
  2011-07-07 21:08           ` Rafael J. Wysocki
@ 2011-07-07 21:08           ` Rafael J. Wysocki
  1 sibling, 0 replies; 35+ messages in thread
From: Rafael J. Wysocki @ 2011-07-07 21:08 UTC (permalink / raw)
  To: MyungJoo Ham
  Cc: Len Brown, Greg Kroah-Hartman, linux-kernel, Kyungmin Park,
	linux-pm, Thomas Gleixner

On Monday, July 04, 2011, MyungJoo Ham wrote:
> On Mon, Jul 4, 2011 at 5:43 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > On Monday, July 04, 2011, MyungJoo Ham wrote:
> >> Hello,
> >>
> >> 2011/7/3 Rafael J. Wysocki <rjw@sisk.pl>:
> >> > Hi,
> >> >
> >> > On Friday, May 27, 2011, MyungJoo Ham wrote:
> >> >> Three CPUFREQ-like governors are provided as examples.
> >> >>
> >> >> powersave: use the lowest frequency possible. The user (device) should
> >> >> set the polling_ms as 0 because polling is useless for this governor.
> >> >>
> >> >> performance: use the highest freqeuncy possible. The user (device)
> >> >> should set the polling_ms as 0 because polling is useless for this
> >> >> governor.
> >> >>
> >> >> simple_ondemand: simplified version of CPUFREQ's ONDEMAND governor.
> >> >>
> >> >> When a user updates OPP entries (enable/disable/add), OPP framework
> >> >> automatically notifies DEVFREQ to update operating frequency
> >> >> accordingly. Thus, DEVFREQ users (device drivers) do not need to update
> >> >> DEVFREQ manually with OPP entry updates or set polling_ms for powersave
> >> >> , performance, or any other "static" governors.
> >> >>
> >> >> Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
> >> >> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
> >> []
> >> >> +
> >> >> +     /* Set the desired frequency based on the load */
> >> >> +     a = (unsigned long long) stat.busy_time * stat.current_frequency;
> >> >
> >> > What's the purpose of the conversion?
> >>
> >> Assuming that the work speed of a device is proportional to its
> >> frequency, it measures the amount of work done.
> >> It's time * work/time. For example, during the last 10 second, if the
> >> busy_time was 5 sec and frequency was 10MHz,
> >> it's "50M", which is same as 20MHz and 2.5 sec.
> >
> > I understand that, but my question was why you're doing a forced conversion
> > to (unsigned long long).
> 
> Ah.. that was for the 64bit operations.
> 
> Both busy_time and current_frequency are 32bit and current_frequency
> may be a big number.
> 
> Thus, in order to get "freq" value without losing bits (e.g., if
> current_frequency = 1GHz and busy_time = 8000, we get an overflow
> without 64bit operations), I've inserted 64bit operations with the
> conversion. For the cosmetic reasons, it appears that "u64" looks
> better though.

You wouldn't need the explicit type casting if you did

a = stat.busy_time;
a *= stat.current_frequency;

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2011-07-07 21:08 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-05-27  4:53 [PATCH v3 1/3] PM: Introduce DEVFREQ: generic DVFS framework with device-specific OPPs MyungJoo Ham
2011-05-27  4:53 ` [PATCH v3 2/3] PM / DEVFREQ: add example governors MyungJoo Ham
2011-07-02 21:58   ` Rafael J. Wysocki
2011-07-04  1:37     ` MyungJoo Ham
2011-07-04  1:37       ` MyungJoo Ham
2011-07-04  8:43       ` Rafael J. Wysocki
2011-07-04  8:43       ` Rafael J. Wysocki
2011-07-04  8:58         ` MyungJoo Ham
2011-07-07 21:08           ` Rafael J. Wysocki
2011-07-07 21:08           ` Rafael J. Wysocki
2011-07-04  8:58         ` MyungJoo Ham
2011-07-02 21:58   ` Rafael J. Wysocki
2011-05-27  4:53 ` MyungJoo Ham
2011-05-27  4:53 ` [PATCH v3 3/3] PM / DEVFREQ: add sysfs interface (including user tickling) MyungJoo Ham
2011-07-02 22:13   ` Rafael J. Wysocki
2011-07-02 22:13   ` Rafael J. Wysocki
2011-07-04  7:15     ` MyungJoo Ham
2011-07-04  8:48       ` Rafael J. Wysocki
2011-07-04  8:48       ` Rafael J. Wysocki
2011-07-04  7:15     ` MyungJoo Ham
2011-05-27  4:53 ` MyungJoo Ham
2011-05-28  8:57 ` [PATCH v3 1/3] PM: Introduce DEVFREQ: generic DVFS framework with device-specific OPPs Nishanth Menon
2011-05-30  5:03   ` MyungJoo Ham
2011-06-16 21:11     ` Rafael J. Wysocki
2011-07-02 21:30       ` Rafael J. Wysocki
2011-07-02 21:30       ` Rafael J. Wysocki
2011-06-16 21:11     ` Rafael J. Wysocki
2011-05-30  5:03   ` MyungJoo Ham
2011-05-28  8:57 ` Nishanth Menon
2011-07-02 21:56 ` Rafael J. Wysocki
2011-07-02 21:56 ` Rafael J. Wysocki
2011-07-04  8:34   ` MyungJoo Ham
2011-07-04  8:57     ` Rafael J. Wysocki
2011-07-04  8:57     ` Rafael J. Wysocki
2011-07-04  8:34   ` MyungJoo Ham

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.