linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/7] Add P state driver for Intel Core Processors
@ 2013-02-05 18:23 dirk.brandewie
  2013-02-05 18:24 ` [PATCH 1/7] cpufreq: Don't remove sysfs link for policy->cpu dirk.brandewie
                   ` (6 more replies)
  0 siblings, 7 replies; 25+ messages in thread
From: dirk.brandewie @ 2013-02-05 18:23 UTC (permalink / raw)
  To: linux-kernel, cpufreq; +Cc: Dirk Brandewie

From: Dirk Brandewie <dirk.brandewie@gmail.com>

This driver implements a scaling driver with an internal governor for
Intel Core processors.  The driver follows the same model as the
Transmeta scaling driver (longrun.c) and implements the setpolicy()
instead of target().  Scaling drivers that implement setpolicy() are
assmuned to implement internal governors by the cpufreq core. All the
logic for selecting the current P state is contained within the driver
no external governor is used by the cpufreq core.

At the moment only Intel SandyBridge processors are supported. As
testing on SandyBridge+ processors is completed support will be added
to the driver.

New sysfs files for controlling P state selection have been added to
/sys/devices/system/cpu/intel_pstate/
      max_perf_pct: limits the maximum P state that will be requested by
      the driver stated as a percentage of the avail performance.
    
      min_perf_pct: limits the minimum P state that will be  requested by
      the driver stated as a percentage of the avail performance.
    
      no_turbo: limits the driver to selecting P states below the turbo
      frequency range.

The units for these for these files are purposely abstract and stated
in terms of available performance and not frequency.  In idea that
frequency can be set to a single frequency is a fiction for Intel Core
processors. Even if the scaling driver selects a single P state the
actual frequency the processor will run at is selected by the
processor

This patch set is based on Rafael's bleeding edge branch commit df9d5b7c
Patch 1:
    Viresh's patch for the sake of completeness

patch 2-5:
    Fix issues related to scaling drivers that implement the
    setpolicy() interface instead of target()

patch 6:
    Fix cpufreq_stats to handle the scaling driver not exposing a
    frequency table while processing CPU_DOWN_PREPARE notification
Patch 7:
    The driver itself

Dirk Brandewie (6):
  cpufreq: Retrieve current frequency from scaling drivers with
    internal governors
  cpufreq: Only query drivers that implement cpufreq_driver.target()
  cpufreq: Do not track governor name for scaling drivers with internal
    governors.
  cpufreq: balance out cpufreq_cpu_{get,put} for scaling drivers using
    setpolicy
  cpufreq_stats: do not remove sysfs files if frequency table is not
    present
  cpufreq/x86: Add P-state driver for sandy bridge.

Viresh Kumar (1):
  cpufreq: Don't remove sysfs link for policy->cpu

 drivers/cpufreq/Kconfig.x86     |   18 +
 drivers/cpufreq/Makefile        |    1 +
 drivers/cpufreq/cpufreq.c       |   21 +-
 drivers/cpufreq/cpufreq_stats.c |    5 +
 drivers/cpufreq/intel_pstate.c  |  829 +++++++++++++++++++++++++++++++++++++++
 5 files changed, 870 insertions(+), 4 deletions(-)
 create mode 100644 drivers/cpufreq/intel_pstate.c

-- 
1.7.7.6


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 1/7] cpufreq: Don't remove sysfs link for policy->cpu
  2013-02-05 18:23 [PATCH 0/7] Add P state driver for Intel Core Processors dirk.brandewie
@ 2013-02-05 18:24 ` dirk.brandewie
  2013-02-06  1:42   ` Viresh Kumar
  2013-02-05 18:24 ` [PATCH 2/7] cpufreq: Retrieve current frequency from scaling drivers with internal governors dirk.brandewie
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 25+ messages in thread
From: dirk.brandewie @ 2013-02-05 18:24 UTC (permalink / raw)
  To: linux-kernel, cpufreq; +Cc: Viresh Kumar

From: Viresh Kumar <viresh.kumar@linaro.org>

"cpufreq" directory in policy->cpu is never created using sysfs_create_link(),
but using kobject_init_and_add(). And so we shouldn't call sysfs_remove_link()
for policy->cpu(). sysfs stuff for policy->cpu is automatically removed when we
call kobject_put() for dying policy.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Tested-by: Dirk Brandewie <dirk.brandewie@gmail.com>
---
 drivers/cpufreq/cpufreq.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 9656420..2817c3c 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1058,7 +1058,9 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif
 	cpus = cpumask_weight(data->cpus);
 	cpumask_clear_cpu(cpu, data->cpus);
 
-	if (unlikely((cpu == data->cpu) && (cpus > 1))) {
+	if (cpu != data->cpu) {
+		sysfs_remove_link(&dev->kobj, "cpufreq");
+	} else if (cpus > 1) {
 		/* first sibling now owns the new sysfs dir */
 		cpu_dev = get_cpu_device(cpumask_first(data->cpus));
 		sysfs_remove_link(&cpu_dev->kobj, "cpufreq");
@@ -1083,7 +1085,6 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif
 	pr_debug("%s: removing link, cpu: %d\n", __func__, cpu);
 	cpufreq_cpu_put(data);
 	unlock_policy_rwsem_write(cpu);
-	sysfs_remove_link(&dev->kobj, "cpufreq");
 
 	/* If cpu is last user of policy, free policy */
 	if (cpus == 1) {
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 2/7] cpufreq: Retrieve current frequency from scaling drivers with internal governors
  2013-02-05 18:23 [PATCH 0/7] Add P state driver for Intel Core Processors dirk.brandewie
  2013-02-05 18:24 ` [PATCH 1/7] cpufreq: Don't remove sysfs link for policy->cpu dirk.brandewie
@ 2013-02-05 18:24 ` dirk.brandewie
  2013-02-06  1:41   ` Viresh Kumar
  2013-02-06  1:45   ` Viresh Kumar
  2013-02-05 18:24 ` [PATCH 3/7] cpufreq: Only query drivers that implement cpufreq_driver.target() dirk.brandewie
                   ` (4 subsequent siblings)
  6 siblings, 2 replies; 25+ messages in thread
From: dirk.brandewie @ 2013-02-05 18:24 UTC (permalink / raw)
  To: linux-kernel, cpufreq; +Cc: Dirk Brandewie, Dirk Brandewie

From: Dirk Brandewie <dirk.brandewie@gmail.com>

Scaling drivers that implement the cpufreq_driver.setpolicy() versus
the cpufreq_driver.target() interface do not set policy->cur.

Normally policy->cur is set during the call to cpufreq_driver.target()
when the frequnecy request is made by the governor.

If the scaling driver implements cpufreq_driver.setpolicy() and
cpufreq_driver.get() interfaces use cpufreq_driver.get() to retrieve
the current frequency.

Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
From: Dirk Brandewie <dirk.brandewie@gmail.com>
Date: 2013-02-04 18:44:40 -0800

cpufreq: balance out cpufreq_cpu_{get,put} for scaling drivers using setpolicy

There is an additional reference added to the driver in
cpufreq_add_dev()  that is removed in__cpufreq_governor() if the
driver implements target().  Remove the last reference when the
driver implements setpolicy()

Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
---
 drivers/cpufreq/cpufreq.c |    7 +++++++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 2817c3c..96bc302 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1181,6 +1181,13 @@ unsigned int cpufreq_quick_get(unsigned int cpu)
 	struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
 	unsigned int ret_freq = 0;
 
+	if (cpufreq_driver && cpufreq_driver->setpolicy &&
+		cpufreq_driver->get) {
+		ret_freq = cpufreq_driver->get(cpu);
+		cpufreq_cpu_put(policy);
+		return ret_freq;
+	}
+
 	if (policy) {
 		ret_freq = policy->cur;
 		cpufreq_cpu_put(policy);
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 3/7] cpufreq: Only query drivers that implement cpufreq_driver.target()
  2013-02-05 18:23 [PATCH 0/7] Add P state driver for Intel Core Processors dirk.brandewie
  2013-02-05 18:24 ` [PATCH 1/7] cpufreq: Don't remove sysfs link for policy->cpu dirk.brandewie
  2013-02-05 18:24 ` [PATCH 2/7] cpufreq: Retrieve current frequency from scaling drivers with internal governors dirk.brandewie
@ 2013-02-05 18:24 ` dirk.brandewie
  2013-02-06  1:47   ` Viresh Kumar
  2013-02-05 18:24 ` [PATCH 4/7] cpufreq: Do not track governor name for scaling drivers with internal governors dirk.brandewie
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 25+ messages in thread
From: dirk.brandewie @ 2013-02-05 18:24 UTC (permalink / raw)
  To: linux-kernel, cpufreq; +Cc: Dirk Brandewie, Dirk Brandewie

From: Dirk Brandewie <dirk.brandewie@gmail.com>

Scaling drivers that implement cpufreq_driver.setpolicy() have
internal governors and may/will change the current operating frequency
very frequently this will cause cpufreq_out_of_sync() to be called
every time. Only call cpufreq_driver.get() for drivers that implement
cpufreq_driver.target()

Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
---
 drivers/cpufreq/cpufreq.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 96bc302..d8daa4b 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1787,7 +1787,7 @@ int cpufreq_update_policy(unsigned int cpu)
 
 	/* BIOS might change freq behind our back
 	  -> ask driver for current freq and notify governors about a change */
-	if (cpufreq_driver->get) {
+	if (cpufreq_driver->get && cpufreq_driver->target) {
 		policy.cur = cpufreq_driver->get(cpu);
 		if (!data->cur) {
 			pr_debug("Driver did not initialize current freq");
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 4/7] cpufreq: Do not track governor name for scaling drivers with internal governors.
  2013-02-05 18:23 [PATCH 0/7] Add P state driver for Intel Core Processors dirk.brandewie
                   ` (2 preceding siblings ...)
  2013-02-05 18:24 ` [PATCH 3/7] cpufreq: Only query drivers that implement cpufreq_driver.target() dirk.brandewie
@ 2013-02-05 18:24 ` dirk.brandewie
  2013-02-06  1:50   ` Viresh Kumar
  2013-02-05 18:24 ` [PATCH 5/7] cpufreq: balance out cpufreq_cpu_{get,put} for scaling drivers using setpolicy dirk.brandewie
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 25+ messages in thread
From: dirk.brandewie @ 2013-02-05 18:24 UTC (permalink / raw)
  To: linux-kernel, cpufreq; +Cc: Dirk Brandewie, Dirk Brandewie

From: Dirk Brandewie <dirk.brandewie@gmail.com>

Scaling drivers that implement internal governors do not have governor
structures assocaited with them.  Only track the name of the governor
associated with the CPU if the driver does not implement
cpufreq_driver.setpolicy()

Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
---
 drivers/cpufreq/cpufreq.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index d8daa4b..622e282 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1050,7 +1050,9 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif
 		__cpufreq_governor(data, CPUFREQ_GOV_STOP);
 
 #ifdef CONFIG_HOTPLUG_CPU
-	strncpy(per_cpu(cpufreq_cpu_governor, cpu), data->governor->name,
+	if (!cpufreq_driver->setpolicy)
+		strncpy(per_cpu(cpufreq_cpu_governor, cpu),
+			data->governor->name,
 			CPUFREQ_NAME_LEN);
 #endif
 
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 5/7] cpufreq: balance out cpufreq_cpu_{get,put} for scaling drivers using setpolicy
  2013-02-05 18:23 [PATCH 0/7] Add P state driver for Intel Core Processors dirk.brandewie
                   ` (3 preceding siblings ...)
  2013-02-05 18:24 ` [PATCH 4/7] cpufreq: Do not track governor name for scaling drivers with internal governors dirk.brandewie
@ 2013-02-05 18:24 ` dirk.brandewie
  2013-02-06  1:58   ` Viresh Kumar
  2013-02-05 18:24 ` [PATCH 6/7] cpufreq_stats: do not remove sysfs files if frequency table is not present dirk.brandewie
  2013-02-05 18:24 ` [PATCH 7/7] cpufreq/x86: Add P-state driver for sandy bridge dirk.brandewie
  6 siblings, 1 reply; 25+ messages in thread
From: dirk.brandewie @ 2013-02-05 18:24 UTC (permalink / raw)
  To: linux-kernel, cpufreq; +Cc: Dirk Brandewie, Dirk Brandewie

From: Dirk Brandewie <dirk.brandewie@gmail.com>

There is an additional reference added to the driver in
cpufreq_add_dev()  that is removed in__cpufreq_governor() if the
driver implements target().  Remove the last reference when the
driver implements setpolicy()

Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
---
 drivers/cpufreq/cpufreq.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 622e282..d17477b 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1049,6 +1049,9 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif
 	if (cpufreq_driver->target)
 		__cpufreq_governor(data, CPUFREQ_GOV_STOP);
 
+	if (cpufreq_driver->setpolicy)
+		cpufreq_cpu_put(data);
+
 #ifdef CONFIG_HOTPLUG_CPU
 	if (!cpufreq_driver->setpolicy)
 		strncpy(per_cpu(cpufreq_cpu_governor, cpu),
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 6/7] cpufreq_stats: do not remove sysfs files if frequency table is not present
  2013-02-05 18:23 [PATCH 0/7] Add P state driver for Intel Core Processors dirk.brandewie
                   ` (4 preceding siblings ...)
  2013-02-05 18:24 ` [PATCH 5/7] cpufreq: balance out cpufreq_cpu_{get,put} for scaling drivers using setpolicy dirk.brandewie
@ 2013-02-05 18:24 ` dirk.brandewie
  2013-02-06  2:18   ` Viresh Kumar
  2013-02-05 18:24 ` [PATCH 7/7] cpufreq/x86: Add P-state driver for sandy bridge dirk.brandewie
  6 siblings, 1 reply; 25+ messages in thread
From: dirk.brandewie @ 2013-02-05 18:24 UTC (permalink / raw)
  To: linux-kernel, cpufreq; +Cc: Dirk Brandewie, Dirk Brandewie

From: Dirk Brandewie <dirk.brandewie@gmail.com>

The sysfs files for cpufreq_stats are created in cpufreq_stats_create_table()
called from cpufreq_stat_notifier_policy() when a policy is added to
the cpu. cpufreq_stats_create_table() will not be called if the
scaling driver does not export a frequency table to cpufreq.  Use the
same fence on tear down.

Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
---
 drivers/cpufreq/cpufreq_stats.c |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/drivers/cpufreq/cpufreq_stats.c b/drivers/cpufreq/cpufreq_stats.c
index 572124c..fb001e2 100644
--- a/drivers/cpufreq/cpufreq_stats.c
+++ b/drivers/cpufreq/cpufreq_stats.c
@@ -185,6 +185,11 @@ static void cpufreq_stats_free_table(unsigned int cpu)
 static void cpufreq_stats_free_sysfs(unsigned int cpu)
 {
 	struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
+	struct cpufreq_frequency_table *table;
+
+	table = cpufreq_frequency_get_table(cpu);
+	if (!table)
+		return;
 	if (policy && !policy_is_shared(policy)) {
 		pr_debug("%s: Free sysfs stat\n", __func__);
 		sysfs_remove_group(&policy->kobj, &stats_attr_group);
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 7/7] cpufreq/x86: Add P-state driver for sandy bridge.
  2013-02-05 18:23 [PATCH 0/7] Add P state driver for Intel Core Processors dirk.brandewie
                   ` (5 preceding siblings ...)
  2013-02-05 18:24 ` [PATCH 6/7] cpufreq_stats: do not remove sysfs files if frequency table is not present dirk.brandewie
@ 2013-02-05 18:24 ` dirk.brandewie
  6 siblings, 0 replies; 25+ messages in thread
From: dirk.brandewie @ 2013-02-05 18:24 UTC (permalink / raw)
  To: linux-kernel, cpufreq; +Cc: Dirk Brandewie, Dirk Brandewie

From: Dirk Brandewie <dirk.brandewie@gmail.com>

Add a P-state driver for the Intel Sandy bridge processor. In cpufreq
terminology this driver implements a  scaling driver with an internal
governor.

When built into the the kernel this driver will be the preferred
scaling driver for Sandy bridge processors.

In addition to the interfaces provided by the cpufreq subsystem for
controlling scaling drivers. The user may control the behavior of the
driver via three sysfs files located in
"/sys/devices/system/cpu/intel_pstate".

  max_perf_pct: limits the maximum P state that will be requested by
  the driver stated as a percentage of the avail performance.

  min_perf_pct: limits the minimum P state that will be  requested by
  the driver stated as a percentage of the avail performance.

  no_turbo: limits the driver to selecting P states below the turbo
  frequency range.

Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
---
 drivers/cpufreq/Kconfig.x86    |   18 +
 drivers/cpufreq/Makefile       |    1 +
 drivers/cpufreq/intel_pstate.c |  829 ++++++++++++++++++++++++++++++++++++++++
 3 files changed, 848 insertions(+), 0 deletions(-)
 create mode 100644 drivers/cpufreq/intel_pstate.c

diff --git a/drivers/cpufreq/Kconfig.x86 b/drivers/cpufreq/Kconfig.x86
index 7227cd7..6aa7053 100644
--- a/drivers/cpufreq/Kconfig.x86
+++ b/drivers/cpufreq/Kconfig.x86
@@ -2,6 +2,24 @@
 # x86 CPU Frequency scaling drivers
 #
 
+config X86_INTEL_PSTATE
+       tristate "Intel P state control"
+       depends on X86
+       help
+          This driver provides a P state for Intel core processors.
+	  The driver implements an internal governor and will become
+          the scaling driver and governor for Sandy bridge processors.
+
+	  When this driver is enabled it will become the perferred
+          scaling driver for Sandy bridge processors.
+
+	  Note: This driver should be built with the same settings as
+	  the other scaling drivers configured into the system
+	  (module/built-in) in order for the driver to register itself
+	  as the scaling driver on the system.
+
+	  If in doubt, say N.
+
 config X86_PCC_CPUFREQ
 	tristate "Processor Clocking Control interface driver"
 	depends on ACPI && ACPI_PROCESSOR
diff --git a/drivers/cpufreq/Makefile b/drivers/cpufreq/Makefile
index f7aed54..39ad489 100644
--- a/drivers/cpufreq/Makefile
+++ b/drivers/cpufreq/Makefile
@@ -40,6 +40,7 @@ obj-$(CONFIG_X86_SPEEDSTEP_SMI)		+= speedstep-smi.o
 obj-$(CONFIG_X86_SPEEDSTEP_CENTRINO)	+= speedstep-centrino.o
 obj-$(CONFIG_X86_P4_CLOCKMOD)		+= p4-clockmod.o
 obj-$(CONFIG_X86_CPUFREQ_NFORCE2)	+= cpufreq-nforce2.o
+obj-$(CONFIG_X86_INTEL_PSTATE)		+= intel_pstate.o
 
 ##################################################################################
 # ARM SoC drivers
diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
new file mode 100644
index 0000000..21cee6a
--- /dev/null
+++ b/drivers/cpufreq/intel_pstate.c
@@ -0,0 +1,829 @@
+/*
+ * cpufreq_snb.c: Native P state management for Intel processors
+ *
+ * (C) Copyright 2012 Intel Corporation
+ * Author: Dirk Brandewie <dirk.j.brandewie@intel.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; version 2
+ * of the License.
+ */
+
+#include <linux/kernel.h>
+#include <linux/kernel_stat.h>
+#include <linux/module.h>
+#include <linux/ktime.h>
+#include <linux/hrtimer.h>
+#include <linux/tick.h>
+#include <linux/slab.h>
+#include <linux/sched.h>
+#include <linux/list.h>
+#include <linux/cpu.h>
+#include <linux/cpufreq.h>
+#include <linux/sysfs.h>
+#include <linux/types.h>
+#include <linux/fs.h>
+#include <linux/debugfs.h>
+#include <trace/events/power.h>
+
+#include <asm/div64.h>
+#include <asm/msr.h>
+#include <asm/cpu_device_id.h>
+
+#define SAMPLE_COUNT		3
+
+#define FRAC_BITS 8
+#define int_tofp(X) ((int64_t)(X) << FRAC_BITS)
+#define fp_toint(X) ((X) >> FRAC_BITS)
+
+static inline int32_t mul_fp(int32_t x, int32_t y)
+{
+	return ((int64_t)x * (int64_t)y) >> FRAC_BITS;
+}
+
+static inline int32_t div_fp(int32_t x, int32_t y)
+{
+	return div_s64((int64_t)x << FRAC_BITS, (int64_t)y);
+}
+
+struct sample {
+	ktime_t start_time;
+	ktime_t end_time;
+	int core_pct_busy;
+	int pstate_pct_busy;
+	u64 duration_us;
+	u64 idletime_us;
+	u64 aperf;
+	u64 mperf;
+	int freq;
+};
+
+struct pstate_data {
+	int	current_pstate;
+	int	min_pstate;
+	int	max_pstate;
+	int	turbo_pstate;
+};
+
+struct _pid {
+	int setpoint;
+	int32_t integral;
+	int32_t p_gain;
+	int32_t i_gain;
+	int32_t d_gain;
+	int deadband;
+	int last_err;
+};
+
+struct cpudata {
+	int cpu;
+
+	char name[64];
+
+	struct timer_list timer;
+
+	struct pstate_adjust_policy *pstate_policy;
+	struct pstate_data pstate;
+	struct _pid pid;
+	struct _pid idle_pid;
+
+	int min_pstate_count;
+	int idle_mode;
+
+	ktime_t prev_sample;
+	u64	prev_idle_time_us;
+	u64	prev_aperf;
+	u64	prev_mperf;
+	int	sample_ptr;
+	struct sample samples[SAMPLE_COUNT];
+};
+
+struct cpudata **all_cpu_data;
+struct pstate_adjust_policy {
+	int sample_rate_ms;
+	int deadband;
+	int setpoint;
+	int p_gain_pct;
+	int d_gain_pct;
+	int i_gain_pct;
+};
+
+static struct pstate_adjust_policy default_policy = {
+	.sample_rate_ms = 10,
+	.deadband = 0,
+	.setpoint = 109,
+	.p_gain_pct = 17,
+	.d_gain_pct = 0,
+	.i_gain_pct = 4,
+};
+
+struct perf_limits {
+	int no_turbo;
+	int max_perf_pct;
+	int min_perf_pct;
+	int32_t max_perf;
+	int32_t min_perf;
+};
+
+static struct perf_limits limits = {
+	.no_turbo = 0,
+	.max_perf_pct = 100,
+	.max_perf = int_tofp(1),
+	.min_perf_pct = 0,
+	.min_perf = 0,
+};
+
+static inline void pid_reset(struct _pid *pid, int setpoint, int busy,
+			int deadband, int integral) {
+	pid->setpoint = setpoint;
+	pid->deadband  = deadband;
+	pid->integral  = int_tofp(integral);
+	pid->last_err  = setpoint - busy;
+}
+
+static inline void pid_p_gain_set(struct _pid *pid, int percent)
+{
+	pid->p_gain = div_fp(int_tofp(percent), int_tofp(100));
+}
+
+static inline void pid_i_gain_set(struct _pid *pid, int percent)
+{
+	pid->i_gain = div_fp(int_tofp(percent), int_tofp(100));
+}
+
+static inline void pid_d_gain_set(struct _pid *pid, int percent)
+{
+
+	pid->d_gain = div_fp(int_tofp(percent), int_tofp(100));
+}
+
+static  signed int pid_calc(struct _pid *pid, int busy)
+{
+	signed int err, result;
+	int32_t pterm, dterm, fp_error;
+	int32_t integral_limit;
+
+
+	err = pid->setpoint - busy;
+	fp_error = int_tofp(err);
+
+	if (abs(err) <= pid->deadband)
+		return 0;
+
+	pterm = mul_fp(pid->p_gain, fp_error);
+
+	pid->integral += fp_error;
+
+	/* limit the integral term */
+	integral_limit = int_tofp(30);
+	if (pid->integral > integral_limit)
+		pid->integral = integral_limit;
+	if (pid->integral < -integral_limit)
+		pid->integral = -integral_limit;
+
+	dterm = mul_fp(pid->d_gain, (err - pid->last_err));
+	pid->last_err = err;
+
+	result = pterm + mul_fp(pid->integral, pid->i_gain) + dterm;
+
+	return (signed int)fp_toint(result);
+}
+
+
+static inline void intel_pstate_busy_pid_reset(struct cpudata *cpu)
+{
+	pid_p_gain_set(&cpu->pid, cpu->pstate_policy->p_gain_pct);
+	pid_d_gain_set(&cpu->pid, cpu->pstate_policy->d_gain_pct);
+	pid_i_gain_set(&cpu->pid, cpu->pstate_policy->i_gain_pct);
+
+	pid_reset(&cpu->pid,
+		cpu->pstate_policy->setpoint,
+		100,
+		cpu->pstate_policy->deadband,
+		0);
+
+}
+
+static inline void intel_pstate_idle_pid_reset(struct cpudata *cpu)
+{
+	pid_p_gain_set(&cpu->idle_pid, cpu->pstate_policy->p_gain_pct);
+	pid_d_gain_set(&cpu->idle_pid, cpu->pstate_policy->d_gain_pct);
+	pid_i_gain_set(&cpu->idle_pid, cpu->pstate_policy->i_gain_pct);
+
+	pid_reset(&cpu->idle_pid,
+		75,
+		50,
+		cpu->pstate_policy->deadband,
+		0);
+}
+
+static inline void intel_pstate_reset_all_pid(void)
+{
+	unsigned int cpu;
+	for_each_online_cpu(cpu) {
+		if (all_cpu_data[cpu])
+			intel_pstate_busy_pid_reset(all_cpu_data[cpu]);
+	}
+}
+
+/************************** debugfs begin ************************/
+static int pid_param_set(void *data, u64 val)
+{
+	*(u32 *)data = val;
+	intel_pstate_reset_all_pid();
+	return 0;
+}
+static int pid_param_get(void *data, u64 *val)
+{
+	*val = *(u32 *)data;
+	return 0;
+}
+DEFINE_SIMPLE_ATTRIBUTE(fops_pid_param, pid_param_get,
+			pid_param_set, "%llu\n");
+
+struct pid_param {
+	char *name;
+	void *value;
+};
+
+static struct pid_param pid_files[] = {
+	{"sample_rate_ms", &default_policy.sample_rate_ms},
+	{"d_gain_pct", &default_policy.d_gain_pct},
+	{"i_gain_pct", &default_policy.i_gain_pct},
+	{"deadband", &default_policy.deadband},
+	{"setpoint", &default_policy.setpoint},
+	{"p_gain_pct", &default_policy.p_gain_pct},
+	{NULL, NULL}
+};
+
+struct dentry *debugfs_parent;
+void intel_pstate_debug_expose_params(void)
+{
+	int i = 0;
+
+	debugfs_parent = debugfs_create_dir("pstate_snb", NULL);
+	if (IS_ERR_OR_NULL(debugfs_parent))
+		return;
+	while (pid_files[i].name) {
+		debugfs_create_file(pid_files[i].name, 0660,
+				debugfs_parent, pid_files[i].value,
+				&fops_pid_param);
+		i++;
+	}
+
+}
+
+/************************** debugfs end ************************/
+
+/************************** sysfs begin ************************/
+#define show_one(file_name, object)					\
+	static ssize_t show_##file_name					\
+	(struct kobject *kobj, struct attribute *attr, char *buf)	\
+	{								\
+		return sprintf(buf, "%u\n", limits.object);	\
+	}
+
+static ssize_t store_no_turbo(struct kobject *a, struct attribute *b,
+				const char *buf, size_t count)
+{
+	unsigned int input;
+	int ret;
+	ret = sscanf(buf, "%u", &input);
+	if (ret != 1)
+		return -EINVAL;
+	limits.no_turbo =  clamp_t(int, input, 0 , 1);
+
+	return count;
+}
+
+static ssize_t store_max_perf_pct(struct kobject *a, struct attribute *b,
+				const char *buf, size_t count)
+{
+	unsigned int input;
+	int ret;
+	ret = sscanf(buf, "%u", &input);
+	if (ret != 1)
+		return -EINVAL;
+
+	limits.max_perf_pct = clamp_t(int, input, 0 , 100);
+	limits.max_perf = div_fp(int_tofp(limits.max_perf_pct), int_tofp(100));
+	return count;
+}
+
+static ssize_t store_min_perf_pct(struct kobject *a, struct attribute *b,
+				const char *buf, size_t count)
+{
+	unsigned int input;
+	int ret;
+	ret = sscanf(buf, "%u", &input);
+	if (ret != 1)
+		return -EINVAL;
+	limits.min_perf_pct = clamp_t(int, input, 0 , 100);
+	limits.min_perf = div_fp(int_tofp(limits.min_perf_pct), int_tofp(100));
+
+	return count;
+}
+
+show_one(no_turbo, no_turbo);
+show_one(max_perf_pct, max_perf_pct);
+show_one(min_perf_pct, min_perf_pct);
+
+
+define_one_global_rw(no_turbo);
+define_one_global_rw(max_perf_pct);
+define_one_global_rw(min_perf_pct);
+
+
+static struct attribute *intel_pstate_attributes[] = {
+	&no_turbo.attr,
+	&max_perf_pct.attr,
+	&min_perf_pct.attr,
+	NULL
+};
+
+static struct attribute_group intel_pstate_attr_group = {
+	.attrs = intel_pstate_attributes,
+};
+static struct kobject *intel_pstate_kobject;
+
+void intel_pstate_sysfs_expose_params(void)
+{
+	int rc;
+
+	intel_pstate_kobject = kobject_create_and_add("intel_pstate",
+						&cpu_subsys.dev_root->kobj);
+	BUG_ON(!intel_pstate_kobject);
+	rc = sysfs_create_group(intel_pstate_kobject,
+				&intel_pstate_attr_group);
+	BUG_ON(rc);
+}
+
+
+/************************** sysfs end ************************/
+
+static int intel_pstate_min_pstate(void)
+{
+	u64 value;
+	rdmsrl(0xCE, value);
+	return (value >> 40) & 0xFF;
+}
+
+static int intel_pstate_max_pstate(void)
+{
+	u64 value;
+	rdmsrl(0xCE, value);
+	return (value >> 8) & 0xFF;
+}
+
+static int intel_pstate_turbo_pstate(void)
+{
+	u64 value;
+	int nont, ret;
+	rdmsrl(0x1AD, value);
+	nont = intel_pstate_max_pstate();
+	ret = ((value) & 255);
+	if (ret <= nont)
+		ret = nont;
+	return ret;
+}
+
+static void intel_pstate_get_min_max(struct cpudata *cpu, int *min, int *max)
+{
+	int max_perf = cpu->pstate.turbo_pstate;
+	int min_perf;
+	if (limits.no_turbo)
+		max_perf = cpu->pstate.max_pstate;
+
+	max_perf = fp_toint(mul_fp(int_tofp(max_perf), limits.max_perf));
+	*max = clamp_t(int, max_perf,
+			cpu->pstate.min_pstate, cpu->pstate.turbo_pstate);
+
+	min_perf = fp_toint(mul_fp(int_tofp(max_perf), limits.min_perf));
+	*min = clamp_t(int, min_perf,
+			cpu->pstate.min_pstate, max_perf);
+}
+
+
+static void intel_pstate_set_pstate(struct cpudata *cpu, int pstate)
+{
+	int max_perf, min_perf;
+
+	intel_pstate_get_min_max(cpu, &min_perf, &max_perf);
+
+	pstate = clamp_t(int, pstate, min_perf, max_perf);
+
+	if (pstate == cpu->pstate.current_pstate)
+		return;
+
+
+#ifndef MODULE
+	trace_cpu_frequency(pstate * 100000, cpu->cpu);
+#endif
+	cpu->pstate.current_pstate = pstate;
+	wrmsrl(MSR_IA32_PERF_CTL, pstate << 8);
+
+}
+
+static inline void intel_pstate_pstate_increase(struct cpudata *cpu, int steps)
+{
+	int target;
+	target = cpu->pstate.current_pstate + steps;
+
+	intel_pstate_set_pstate(cpu, target);
+}
+
+static inline void intel_pstate_pstate_decrease(struct cpudata *cpu, int steps)
+{
+	int target;
+	target = cpu->pstate.current_pstate - steps;
+	intel_pstate_set_pstate(cpu, target);
+}
+
+static void intel_pstate_get_cpu_pstates(struct cpudata *cpu)
+{
+	sprintf(cpu->name, "Intel 2nd generation core");
+
+	cpu->pstate.min_pstate = intel_pstate_min_pstate();
+	cpu->pstate.max_pstate = intel_pstate_max_pstate();
+	cpu->pstate.turbo_pstate = intel_pstate_turbo_pstate();
+
+	/* goto max pstate so we don't slow up boot if we are built-in
+	   if we are a module we will take care of it during normal
+	   operation
+	*/
+	intel_pstate_set_pstate(cpu, cpu->pstate.max_pstate);
+}
+
+
+static inline void intel_pstate_calc_busy(struct cpudata *cpu,
+					struct sample *sample)
+{
+	u64 core_pct;
+	sample->pstate_pct_busy = 100 - div64_u64(
+					sample->idletime_us * 100,
+					sample->duration_us);
+	core_pct = div64_u64(sample->aperf * 100, sample->mperf);
+	sample->freq = cpu->pstate.turbo_pstate * core_pct * 1000;
+
+	sample->core_pct_busy = sample->pstate_pct_busy * core_pct / 100;
+}
+
+static inline int intel_pstate_sample(struct cpudata *cpu)
+{
+	ktime_t now;
+	u64 idle_time_us;
+	u64 aperf, mperf;
+
+	now = ktime_get();
+	idle_time_us = get_cpu_idle_time_us(cpu->cpu, NULL);
+
+	rdmsrl(MSR_IA32_APERF, aperf);
+	rdmsrl(MSR_IA32_MPERF, mperf);
+	/* for the first sample, don't actually record a sample, just
+	 * set the baseline */
+	if (cpu->prev_idle_time_us > 0) {
+		cpu->sample_ptr = (cpu->sample_ptr + 1) % SAMPLE_COUNT;
+		cpu->samples[cpu->sample_ptr].start_time = cpu->prev_sample;
+		cpu->samples[cpu->sample_ptr].end_time = now;
+		cpu->samples[cpu->sample_ptr].duration_us =
+			ktime_us_delta(now, cpu->prev_sample);
+		cpu->samples[cpu->sample_ptr].idletime_us =
+			idle_time_us - cpu->prev_idle_time_us;
+
+		cpu->samples[cpu->sample_ptr].aperf = aperf;
+		cpu->samples[cpu->sample_ptr].mperf = mperf;
+		cpu->samples[cpu->sample_ptr].aperf -= cpu->prev_aperf;
+		cpu->samples[cpu->sample_ptr].mperf -= cpu->prev_mperf;
+
+		intel_pstate_calc_busy(cpu, &cpu->samples[cpu->sample_ptr]);
+	}
+
+	cpu->prev_sample = now;
+	cpu->prev_idle_time_us = idle_time_us;
+	cpu->prev_aperf = aperf;
+	cpu->prev_mperf = mperf;
+	return cpu->sample_ptr;
+}
+
+static inline void intel_pstate_set_sample_time(struct cpudata *cpu)
+{
+	int sample_time;
+	int delay;
+
+	sample_time = cpu->pstate_policy->sample_rate_ms;
+	delay = msecs_to_jiffies(sample_time);
+	delay -= jiffies % delay;
+	mod_timer_pinned(&cpu->timer, jiffies + delay);
+}
+
+static inline void intel_pstate_idle_mode(struct cpudata *cpu)
+{
+	cpu->idle_mode = 1;
+}
+
+static inline void intel_pstate_normal_mode(struct cpudata *cpu)
+{
+	cpu->idle_mode = 0;
+}
+
+static inline int intel_pstate_get_scaled_busy(struct cpudata *cpu)
+{
+	int32_t busy_scaled;
+	int32_t core_busy, turbo_pstate, current_pstate;
+
+	core_busy    = int_tofp(cpu->samples[cpu->sample_ptr].core_pct_busy);
+	turbo_pstate   = int_tofp(cpu->pstate.turbo_pstate);
+	current_pstate = int_tofp(cpu->pstate.current_pstate);
+	busy_scaled = mul_fp(core_busy, div_fp(turbo_pstate, current_pstate));
+
+
+	return fp_toint(busy_scaled);
+}
+
+static inline void intel_pstate_adjust_busy_pstate(struct cpudata *cpu)
+{
+	int busy_scaled;
+	struct _pid *pid;
+	signed int ctl = 0;
+	int steps;
+
+	pid = &cpu->pid;
+	busy_scaled =  intel_pstate_get_scaled_busy(cpu);
+
+	ctl = pid_calc(pid, busy_scaled);
+
+	steps = abs(ctl);
+	if (ctl < 0)
+		intel_pstate_pstate_increase(cpu, steps);
+	else
+		intel_pstate_pstate_decrease(cpu, steps);
+}
+
+static inline void intel_pstate_adjust_idle_pstate(struct cpudata *cpu)
+{
+	int busy_scaled;
+	struct _pid *pid;
+	int ctl = 0;
+	int steps;
+
+	pid = &cpu->idle_pid;
+
+	busy_scaled =  intel_pstate_get_scaled_busy(cpu);
+
+	ctl = pid_calc(pid, 100 - busy_scaled);
+
+	steps = abs(ctl);
+	if (ctl < 0)
+		intel_pstate_pstate_decrease(cpu, steps);
+	else
+		intel_pstate_pstate_increase(cpu, steps);
+
+	if (cpu->pstate.current_pstate == cpu->pstate.min_pstate)
+		intel_pstate_normal_mode(cpu);
+}
+
+static void intel_pstate_timer_func(unsigned long __data)
+{
+	struct cpudata *cpu = (struct cpudata *) __data;
+	int idx;
+
+
+	idx = intel_pstate_sample(cpu);
+
+
+	if (!cpu->idle_mode)
+		intel_pstate_adjust_busy_pstate(cpu);
+	else
+		intel_pstate_adjust_idle_pstate(cpu);
+
+#if defined(XPERF_FIX)
+	if (cpu->pstate.current_pstate == cpu->pstate.min_pstate) {
+		cpu->min_pstate_count++;
+		if (!(cpu->min_pstate_count % 5)) {
+			intel_pstate_set_pstate(cpu, cpu->pstate.max_pstate);
+			intel_pstate_idle_mode(cpu);
+		}
+	} else
+		cpu->min_pstate_count = 0;
+#endif
+	intel_pstate_set_sample_time(cpu);
+}
+
+#define ICPU(model, policy) \
+	{ X86_VENDOR_INTEL, 6, model, X86_FEATURE_ANY, (unsigned long)&policy }
+
+static const struct x86_cpu_id intel_pstate_cpu_ids[] = {
+	ICPU(0x2a, default_policy),
+	ICPU(0x2d, default_policy),
+	{}
+};
+MODULE_DEVICE_TABLE(x86cpu, intel_pstate_cpu_ids);
+
+static int intel_pstate_init_cpu(unsigned int cpunum)
+{
+
+	const struct x86_cpu_id *id;
+	struct cpudata *cpu;
+
+	id = x86_match_cpu(intel_pstate_cpu_ids);
+	if (!id)
+		return -ENODEV;
+
+	all_cpu_data[cpunum] = kzalloc(sizeof(struct cpudata), GFP_KERNEL);
+	if (!all_cpu_data[cpunum])
+		return -ENOMEM;
+
+	cpu = all_cpu_data[cpunum];
+
+	intel_pstate_get_cpu_pstates(cpu);
+
+	cpu->cpu = cpunum;
+	cpu->pstate_policy =
+		(struct pstate_adjust_policy *)id->driver_data;
+	init_timer_deferrable(&cpu->timer);
+	cpu->timer.function = intel_pstate_timer_func;
+	cpu->timer.data =
+		(unsigned long)cpu;
+	cpu->timer.expires = jiffies + HZ/100;
+	intel_pstate_busy_pid_reset(cpu);
+	intel_pstate_idle_pid_reset(cpu);
+	intel_pstate_sample(cpu);
+	intel_pstate_set_pstate(cpu, cpu->pstate.max_pstate);
+
+	add_timer_on(&cpu->timer, cpunum);
+
+	pr_info("Intel pstate controlling:  cpu %d\n", cpunum);
+
+	return 0;
+}
+
+unsigned int intel_pstate_get(unsigned int cpu_num)
+{
+	struct sample *sample;
+	struct cpudata *cpu;
+
+	cpu = all_cpu_data[cpu_num];
+	if (!cpu)
+		return 0;
+	sample = &cpu->samples[cpu->sample_ptr];
+	return sample->freq;
+}
+
+static int intel_pstate_set_policy(struct cpufreq_policy *policy)
+{
+	struct cpudata *cpu;
+	int min, max;
+
+	cpu = all_cpu_data[policy->cpu];
+
+	intel_pstate_get_min_max(cpu, &min, &max);
+
+	limits.min_perf_pct = (policy->min * 100) / policy->cpuinfo.max_freq;
+	limits.min_perf_pct = clamp_t(int, limits.min_perf_pct, 0 , 100);
+	limits.min_perf = div_fp(int_tofp(limits.min_perf_pct), int_tofp(100));
+
+
+	limits.max_perf_pct = policy->max * 100 / policy->cpuinfo.max_freq;
+	limits.max_perf_pct = clamp_t(int, limits.max_perf_pct, 0 , 100);
+	limits.max_perf = div_fp(int_tofp(limits.max_perf_pct), int_tofp(100));
+
+
+
+	if (policy->policy == CPUFREQ_POLICY_PERFORMANCE) {
+		limits.min_perf_pct = 100;
+		limits.min_perf = int_tofp(1);
+		limits.max_perf_pct = 100;
+		limits.max_perf = int_tofp(1);
+		limits.no_turbo = 0;
+	}
+
+	return 0;
+}
+
+static int intel_pstate_verify_policy(struct cpufreq_policy *policy)
+{
+	cpufreq_verify_within_limits(policy,
+				policy->cpuinfo.min_freq,
+				policy->cpuinfo.max_freq);
+
+	if ((policy->policy != CPUFREQ_POLICY_POWERSAVE) &&
+		(policy->policy != CPUFREQ_POLICY_PERFORMANCE))
+		return -EINVAL;
+
+	return 0;
+}
+
+static int __cpuinit intel_pstate_cpu_exit(struct cpufreq_policy *policy)
+{
+	int cpu = policy->cpu;
+
+	del_timer(&all_cpu_data[cpu]->timer);
+	kfree(all_cpu_data[cpu]);
+	all_cpu_data[cpu] = NULL;
+	return 0;
+}
+
+static int __cpuinit intel_pstate_cpu_init(struct cpufreq_policy *policy)
+{
+	int rc, min_pstate, max_pstate;
+	struct cpudata *cpu;
+
+	rc = intel_pstate_init_cpu(policy->cpu);
+	if (rc)
+		return rc;
+
+	cpu = all_cpu_data[policy->cpu];
+
+	if (!limits.no_turbo &&
+		limits.min_perf_pct == 100 && limits.max_perf_pct == 100)
+		policy->policy = CPUFREQ_POLICY_PERFORMANCE;
+	else
+		policy->policy = CPUFREQ_POLICY_POWERSAVE;
+
+	intel_pstate_get_min_max(cpu, &min_pstate, &max_pstate);
+	policy->min = min_pstate * 100000;
+	policy->max = max_pstate * 100000;
+
+	/* cpuinfo and default policy values */
+	policy->cpuinfo.min_freq = cpu->pstate.min_pstate * 100000;
+	policy->cpuinfo.max_freq = cpu->pstate.turbo_pstate * 100000;
+	policy->cpuinfo.transition_latency = CPUFREQ_ETERNAL;
+	cpumask_set_cpu(policy->cpu, policy->cpus);
+
+	return 0;
+}
+
+static struct cpufreq_driver intel_pstate_driver = {
+	.flags		= CPUFREQ_CONST_LOOPS,
+	.verify		= intel_pstate_verify_policy,
+	.setpolicy	= intel_pstate_set_policy,
+	.get		= intel_pstate_get,
+	.init		= intel_pstate_cpu_init,
+	.exit		= intel_pstate_cpu_exit,
+	.name		= "intel_pstate",
+	.owner		= THIS_MODULE,
+};
+
+
+static void intel_pstate_exit(void)
+{
+	int cpu;
+
+
+	sysfs_remove_group(intel_pstate_kobject,
+				&intel_pstate_attr_group);
+	debugfs_remove_recursive(debugfs_parent);
+
+	cpufreq_unregister_driver(&intel_pstate_driver);
+
+	if (!all_cpu_data)
+		return;
+
+	get_online_cpus();
+	for_each_online_cpu(cpu) {
+		if (all_cpu_data[cpu]) {
+			del_timer_sync(&all_cpu_data[cpu]->timer);
+			kfree(all_cpu_data[cpu]);
+		}
+	}
+
+	put_online_cpus();
+	vfree(all_cpu_data);
+}
+
+static int __init intel_pstate_init(void)
+{
+	int rc = 0;
+
+	const struct x86_cpu_id *id;
+
+	id = x86_match_cpu(intel_pstate_cpu_ids);
+	if (!id)
+		return -ENODEV;
+
+	pr_info("Intel P-state driver initializing.\n");
+
+	all_cpu_data = vmalloc(sizeof(void *) * num_possible_cpus());
+	if (!all_cpu_data)
+		return -ENOMEM;
+	memset(all_cpu_data, 0, sizeof(void *) * num_possible_cpus());
+
+	rc = cpufreq_register_driver(&intel_pstate_driver);
+	if (rc)
+		goto out;
+
+	intel_pstate_debug_expose_params();
+	intel_pstate_sysfs_expose_params();
+	return rc;
+out:
+	intel_pstate_exit();
+	return -ENODEV;
+}
+
+MODULE_AUTHOR("Dirk Brandewie <dirk.j.brandewie@intel.com>");
+MODULE_DESCRIPTION("'intel_pstate' - P state driver Intel Core processors");
+MODULE_LICENSE("GPL");
+
+device_initcall(intel_pstate_init);
+module_exit(intel_pstate_exit);
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/7] cpufreq: Retrieve current frequency from scaling drivers with internal governors
  2013-02-05 18:24 ` [PATCH 2/7] cpufreq: Retrieve current frequency from scaling drivers with internal governors dirk.brandewie
@ 2013-02-06  1:41   ` Viresh Kumar
  2013-02-06  1:45   ` Viresh Kumar
  1 sibling, 0 replies; 25+ messages in thread
From: Viresh Kumar @ 2013-02-06  1:41 UTC (permalink / raw)
  To: dirk.brandewie; +Cc: linux-kernel, cpufreq, Dirk Brandewie

On Tue, Feb 5, 2013 at 11:54 PM,  <dirk.brandewie@gmail.com> wrote:
> From: Dirk Brandewie <dirk.brandewie@gmail.com>
>
> Scaling drivers that implement the cpufreq_driver.setpolicy() versus
> the cpufreq_driver.target() interface do not set policy->cur.

> Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>

> From: Dirk Brandewie <dirk.brandewie@gmail.com>
> Date: 2013-02-04 18:44:40 -0800
>
> cpufreq: balance out cpufreq_cpu_{get,put} for scaling drivers using setpolicy

> Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>

Wow!! Something happened here. :)

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/7] cpufreq: Don't remove sysfs link for policy->cpu
  2013-02-05 18:24 ` [PATCH 1/7] cpufreq: Don't remove sysfs link for policy->cpu dirk.brandewie
@ 2013-02-06  1:42   ` Viresh Kumar
  0 siblings, 0 replies; 25+ messages in thread
From: Viresh Kumar @ 2013-02-06  1:42 UTC (permalink / raw)
  To: dirk.brandewie; +Cc: linux-kernel, cpufreq

On Tue, Feb 5, 2013 at 11:54 PM,  <dirk.brandewie@gmail.com> wrote:
> From: Viresh Kumar <viresh.kumar@linaro.org>
>
> "cpufreq" directory in policy->cpu is never created using sysfs_create_link(),
> but using kobject_init_and_add(). And so we shouldn't call sysfs_remove_link()
> for policy->cpu(). sysfs stuff for policy->cpu is automatically removed when we
> call kobject_put() for dying policy.
>
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> Tested-by: Dirk Brandewie <dirk.brandewie@gmail.com>

Its already included by Rafael, don't include it in your next version:

commit 73bf0fc2b03d1f4fdada0ec430dc20bfb089cfd5
Author: Viresh Kumar <viresh.kumar@linaro.org>
Date:   Tue Feb 5 22:21:14 2013 +0100

    cpufreq: Don't remove sysfs link for policy->cpu

    "cpufreq" directory in policy->cpu is never created using
    sysfs_create_link(), but using kobject_init_and_add(). And so we
    shouldn't call sysfs_remove_link() for policy->cpu().  sysfs stuff
    for policy->cpu is automatically removed when we call kobject_put()
    for dying policy.

    Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
    Tested-by: Dirk Brandewie <dirk.brandewie@gmail.com>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/7] cpufreq: Retrieve current frequency from scaling drivers with internal governors
  2013-02-05 18:24 ` [PATCH 2/7] cpufreq: Retrieve current frequency from scaling drivers with internal governors dirk.brandewie
  2013-02-06  1:41   ` Viresh Kumar
@ 2013-02-06  1:45   ` Viresh Kumar
  2013-02-06  2:15     ` Dirk Brandewie
  1 sibling, 1 reply; 25+ messages in thread
From: Viresh Kumar @ 2013-02-06  1:45 UTC (permalink / raw)
  To: dirk.brandewie; +Cc: linux-kernel, cpufreq, Dirk Brandewie

On Tue, Feb 5, 2013 at 11:54 PM,  <dirk.brandewie@gmail.com> wrote:
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 2817c3c..96bc302 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -1181,6 +1181,13 @@ unsigned int cpufreq_quick_get(unsigned int cpu)
>         struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
>         unsigned int ret_freq = 0;
>
> +       if (cpufreq_driver && cpufreq_driver->setpolicy &&
> +               cpufreq_driver->get) {
> +               ret_freq = cpufreq_driver->get(cpu);
> +               cpufreq_cpu_put(policy);
> +               return ret_freq;
> +       }
> +

This was my comment on the last version:

"You are required to do cpufreq_cpu_put() in this case too... Better do
cpufreq_cpu_get() after your check."

It still applies.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/7] cpufreq: Only query drivers that implement cpufreq_driver.target()
  2013-02-05 18:24 ` [PATCH 3/7] cpufreq: Only query drivers that implement cpufreq_driver.target() dirk.brandewie
@ 2013-02-06  1:47   ` Viresh Kumar
  2013-02-06  2:06     ` Dirk Brandewie
  0 siblings, 1 reply; 25+ messages in thread
From: Viresh Kumar @ 2013-02-06  1:47 UTC (permalink / raw)
  To: dirk.brandewie; +Cc: linux-kernel, cpufreq, Dirk Brandewie

On Tue, Feb 5, 2013 at 11:54 PM,  <dirk.brandewie@gmail.com> wrote:
> From: Dirk Brandewie <dirk.brandewie@gmail.com>
>
> Scaling drivers that implement cpufreq_driver.setpolicy() have
> internal governors and may/will change the current operating frequency
> very frequently this will cause cpufreq_out_of_sync() to be called
> every time. Only call cpufreq_driver.get() for drivers that implement
> cpufreq_driver.target()
>
> Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
> ---
>  drivers/cpufreq/cpufreq.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 96bc302..d8daa4b 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -1787,7 +1787,7 @@ int cpufreq_update_policy(unsigned int cpu)
>
>         /* BIOS might change freq behind our back
>           -> ask driver for current freq and notify governors about a change */
> -       if (cpufreq_driver->get) {
> +       if (cpufreq_driver->get && cpufreq_driver->target) {
>                 policy.cur = cpufreq_driver->get(cpu);
>                 if (!data->cur) {
>                         pr_debug("Driver did not initialize current freq");

I am really not liking copy-pasting my older comments here :(

"This would mean policy->cur has a garbage value. I don't really know
how would other routine behave on this. Would it make sense to make
policy->cur zero atleast?
"

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 4/7] cpufreq: Do not track governor name for scaling drivers with internal governors.
  2013-02-05 18:24 ` [PATCH 4/7] cpufreq: Do not track governor name for scaling drivers with internal governors dirk.brandewie
@ 2013-02-06  1:50   ` Viresh Kumar
  0 siblings, 0 replies; 25+ messages in thread
From: Viresh Kumar @ 2013-02-06  1:50 UTC (permalink / raw)
  To: dirk.brandewie; +Cc: linux-kernel, cpufreq, Dirk Brandewie

On Tue, Feb 5, 2013 at 11:54 PM,  <dirk.brandewie@gmail.com> wrote:
> From: Dirk Brandewie <dirk.brandewie@gmail.com>
>
> Scaling drivers that implement internal governors do not have governor
> structures assocaited with them.  Only track the name of the governor
> associated with the CPU if the driver does not implement
> cpufreq_driver.setpolicy()
>
> Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
> ---
>  drivers/cpufreq/cpufreq.c |    4 +++-
>  1 files changed, 3 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index d8daa4b..622e282 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -1050,7 +1050,9 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif
>                 __cpufreq_governor(data, CPUFREQ_GOV_STOP);
>
>  #ifdef CONFIG_HOTPLUG_CPU
> -       strncpy(per_cpu(cpufreq_cpu_governor, cpu), data->governor->name,
> +       if (!cpufreq_driver->setpolicy)
> +               strncpy(per_cpu(cpufreq_cpu_governor, cpu),
> +                       data->governor->name,
>                         CPUFREQ_NAME_LEN);

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 5/7] cpufreq: balance out cpufreq_cpu_{get,put} for scaling drivers using setpolicy
  2013-02-05 18:24 ` [PATCH 5/7] cpufreq: balance out cpufreq_cpu_{get,put} for scaling drivers using setpolicy dirk.brandewie
@ 2013-02-06  1:58   ` Viresh Kumar
  2013-02-06  2:08     ` Dirk Brandewie
  0 siblings, 1 reply; 25+ messages in thread
From: Viresh Kumar @ 2013-02-06  1:58 UTC (permalink / raw)
  To: dirk.brandewie; +Cc: linux-kernel, cpufreq, Dirk Brandewie

On Tue, Feb 5, 2013 at 11:54 PM,  <dirk.brandewie@gmail.com> wrote:
> From: Dirk Brandewie <dirk.brandewie@gmail.com>
>
> There is an additional reference added to the driver in
> cpufreq_add_dev()  that is removed in__cpufreq_governor() if the
> driver implements target().  Remove the last reference when the
> driver implements setpolicy()
>
> Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
> ---
>  drivers/cpufreq/cpufreq.c |    3 +++
>  1 files changed, 3 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 622e282..d17477b 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -1049,6 +1049,9 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif
>         if (cpufreq_driver->target)
>                 __cpufreq_governor(data, CPUFREQ_GOV_STOP);
>
> +       if (cpufreq_driver->setpolicy)
> +               cpufreq_cpu_put(data);

I don't understand this patch at all.. I grepped both cpufreq_cpu_get() & put()
in bleeding-edge and found everything to be correct.

Can you please point me to the exact line numbers ?

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/7] cpufreq: Only query drivers that implement cpufreq_driver.target()
  2013-02-06  1:47   ` Viresh Kumar
@ 2013-02-06  2:06     ` Dirk Brandewie
  2013-02-06  2:43       ` Viresh Kumar
  0 siblings, 1 reply; 25+ messages in thread
From: Dirk Brandewie @ 2013-02-06  2:06 UTC (permalink / raw)
  To: Viresh Kumar; +Cc: dirk.brandewie, linux-kernel, cpufreq, Dirk Brandewie

On 02/05/2013 05:47 PM, Viresh Kumar wrote:
> On Tue, Feb 5, 2013 at 11:54 PM,  <dirk.brandewie@gmail.com> wrote:
>> From: Dirk Brandewie <dirk.brandewie@gmail.com>
>>
>> Scaling drivers that implement cpufreq_driver.setpolicy() have
>> internal governors and may/will change the current operating frequency
>> very frequently this will cause cpufreq_out_of_sync() to be called
>> every time. Only call cpufreq_driver.get() for drivers that implement
>> cpufreq_driver.target()
>>
>> Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
>> ---
>>   drivers/cpufreq/cpufreq.c |    2 +-
>>   1 files changed, 1 insertions(+), 1 deletions(-)
>>
>> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
>> index 96bc302..d8daa4b 100644
>> --- a/drivers/cpufreq/cpufreq.c
>> +++ b/drivers/cpufreq/cpufreq.c
>> @@ -1787,7 +1787,7 @@ int cpufreq_update_policy(unsigned int cpu)
>>
>>          /* BIOS might change freq behind our back
>>            -> ask driver for current freq and notify governors about a change */
>> -       if (cpufreq_driver->get) {
>> +       if (cpufreq_driver->get && cpufreq_driver->target) {
>>                  policy.cur = cpufreq_driver->get(cpu);
>>                  if (!data->cur) {
>>                          pr_debug("Driver did not initialize current freq");
>
> I am really not liking copy-pasting my older comments here :(
>
> "This would mean policy->cur has a garbage value. I don't really know
> how would other routine behave on this. Would it make sense to make
> policy->cur zero atleast?
> "
>
The driver implements get() and will return a valid value but the other 
components that track the current frequency will not have been notified about 
any change so there is nothing to be out of sync with.  There is no reason to 
call cpufreq_out_of_sync() where the driver being used implements an internal
governor.

--Dirk


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 5/7] cpufreq: balance out cpufreq_cpu_{get,put} for scaling drivers using setpolicy
  2013-02-06  1:58   ` Viresh Kumar
@ 2013-02-06  2:08     ` Dirk Brandewie
  2013-02-06  2:45       ` Viresh Kumar
  0 siblings, 1 reply; 25+ messages in thread
From: Dirk Brandewie @ 2013-02-06  2:08 UTC (permalink / raw)
  To: Viresh Kumar; +Cc: dirk.brandewie, linux-kernel, cpufreq, Dirk Brandewie

On 02/05/2013 05:58 PM, Viresh Kumar wrote:
> On Tue, Feb 5, 2013 at 11:54 PM,  <dirk.brandewie@gmail.com> wrote:
>> From: Dirk Brandewie <dirk.brandewie@gmail.com>
>>
>> There is an additional reference added to the driver in
>> cpufreq_add_dev()  that is removed in__cpufreq_governor() if the
>> driver implements target().  Remove the last reference when the
>> driver implements setpolicy()
>>
>> Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
>> ---
>>   drivers/cpufreq/cpufreq.c |    3 +++
>>   1 files changed, 3 insertions(+), 0 deletions(-)
>>
>> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
>> index 622e282..d17477b 100644
>> --- a/drivers/cpufreq/cpufreq.c
>> +++ b/drivers/cpufreq/cpufreq.c
>> @@ -1049,6 +1049,9 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif
>>          if (cpufreq_driver->target)
>>                  __cpufreq_governor(data, CPUFREQ_GOV_STOP);
>>
>> +       if (cpufreq_driver->setpolicy)
>> +               cpufreq_cpu_put(data);
>
> I don't understand this patch at all.. I grepped both cpufreq_cpu_get() & put()
> in bleeding-edge and found everything to be correct.
>
> Can you please point me to the exact line numbers ?
>

Line 878 in cpufreq_add_dev()

--Dirk

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/7] cpufreq: Retrieve current frequency from scaling drivers with internal governors
  2013-02-06  1:45   ` Viresh Kumar
@ 2013-02-06  2:15     ` Dirk Brandewie
  2013-02-06  2:25       ` Viresh Kumar
  0 siblings, 1 reply; 25+ messages in thread
From: Dirk Brandewie @ 2013-02-06  2:15 UTC (permalink / raw)
  To: Viresh Kumar; +Cc: dirk.brandewie, linux-kernel, cpufreq, Dirk Brandewie

On 02/05/2013 05:45 PM, Viresh Kumar wrote:
> On Tue, Feb 5, 2013 at 11:54 PM,  <dirk.brandewie@gmail.com> wrote:
>> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
>> index 2817c3c..96bc302 100644
>> --- a/drivers/cpufreq/cpufreq.c
>> +++ b/drivers/cpufreq/cpufreq.c
>> @@ -1181,6 +1181,13 @@ unsigned int cpufreq_quick_get(unsigned int cpu)
>>          struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
>>          unsigned int ret_freq = 0;
>>
>> +       if (cpufreq_driver && cpufreq_driver->setpolicy &&
>> +               cpufreq_driver->get) {
>> +               ret_freq = cpufreq_driver->get(cpu);
>> +               cpufreq_cpu_put(policy);
>> +               return ret_freq;
>> +       }
>> +
>
How about this?

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 2817c3c..9c0eac4 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1182,7 +1182,12 @@ unsigned int cpufreq_quick_get(unsigned int cpu)
  	unsigned int ret_freq = 0;

  	if (policy) {
-		ret_freq = policy->cur;
+		if (cpufreq_driver && cpufreq_driver->setpolicy &&
+			cpufreq_driver->get) {
+			ret_freq = cpufreq_driver->get(cpu);
+		} else {
+			ret_freq = policy->cur;
+		}
  		cpufreq_cpu_put(policy);
  	}



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH 6/7] cpufreq_stats: do not remove sysfs files if frequency table is not present
  2013-02-05 18:24 ` [PATCH 6/7] cpufreq_stats: do not remove sysfs files if frequency table is not present dirk.brandewie
@ 2013-02-06  2:18   ` Viresh Kumar
  0 siblings, 0 replies; 25+ messages in thread
From: Viresh Kumar @ 2013-02-06  2:18 UTC (permalink / raw)
  To: dirk.brandewie; +Cc: linux-kernel, cpufreq, Dirk Brandewie

On Tue, Feb 5, 2013 at 11:54 PM,  <dirk.brandewie@gmail.com> wrote:
> From: Dirk Brandewie <dirk.brandewie@gmail.com>
>
> The sysfs files for cpufreq_stats are created in cpufreq_stats_create_table()
> called from cpufreq_stat_notifier_policy() when a policy is added to
> the cpu. cpufreq_stats_create_table() will not be called if the
> scaling driver does not export a frequency table to cpufreq.  Use the
> same fence on tear down.
>
> Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
> ---
>  drivers/cpufreq/cpufreq_stats.c |    5 +++++
>  1 files changed, 5 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/cpufreq/cpufreq_stats.c b/drivers/cpufreq/cpufreq_stats.c
> index 572124c..fb001e2 100644
> --- a/drivers/cpufreq/cpufreq_stats.c
> +++ b/drivers/cpufreq/cpufreq_stats.c
> @@ -185,6 +185,11 @@ static void cpufreq_stats_free_table(unsigned int cpu)
>  static void cpufreq_stats_free_sysfs(unsigned int cpu)
>  {
>         struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
> +       struct cpufreq_frequency_table *table;
> +
> +       table = cpufreq_frequency_get_table(cpu);
> +       if (!table)
> +               return;

Because we aren't using table later on, i would write it as:

> +       if (!cpufreq_frequency_get_table(cpu))
> +               return;

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/7] cpufreq: Retrieve current frequency from scaling drivers with internal governors
  2013-02-06  2:15     ` Dirk Brandewie
@ 2013-02-06  2:25       ` Viresh Kumar
  2013-02-06  2:31         ` Dirk Brandewie
  0 siblings, 1 reply; 25+ messages in thread
From: Viresh Kumar @ 2013-02-06  2:25 UTC (permalink / raw)
  To: Dirk Brandewie; +Cc: linux-kernel, cpufreq, Dirk Brandewie

On Wed, Feb 6, 2013 at 7:45 AM, Dirk Brandewie <dirk.brandewie@gmail.com> wrote:
> How about this?
>
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 2817c3c..9c0eac4 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -1182,7 +1182,12 @@ unsigned int cpufreq_quick_get(unsigned int cpu)
>
>         unsigned int ret_freq = 0;
>
>         if (policy) {
> -               ret_freq = policy->cur;
>
> +               if (cpufreq_driver && cpufreq_driver->setpolicy &&
> +                       cpufreq_driver->get) {
> +                       ret_freq = cpufreq_driver->get(cpu);
> +               } else {
> +                       ret_freq = policy->cur;
> +               }
>                 cpufreq_cpu_put(policy);

The problem is: You don't need to do get/put in your case at all.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/7] cpufreq: Retrieve current frequency from scaling drivers with internal governors
  2013-02-06  2:25       ` Viresh Kumar
@ 2013-02-06  2:31         ` Dirk Brandewie
  2013-02-06  2:46           ` Viresh Kumar
  0 siblings, 1 reply; 25+ messages in thread
From: Dirk Brandewie @ 2013-02-06  2:31 UTC (permalink / raw)
  To: Viresh Kumar; +Cc: Dirk Brandewie, linux-kernel, cpufreq, Dirk Brandewie

On 02/05/2013 06:25 PM, Viresh Kumar wrote:
> On Wed, Feb 6, 2013 at 7:45 AM, Dirk Brandewie <dirk.brandewie@gmail.com> wrote:
>> How about this?
>>
>> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
>> index 2817c3c..9c0eac4 100644
>> --- a/drivers/cpufreq/cpufreq.c
>> +++ b/drivers/cpufreq/cpufreq.c
>> @@ -1182,7 +1182,12 @@ unsigned int cpufreq_quick_get(unsigned int cpu)
>>
>>          unsigned int ret_freq = 0;
>>
>>          if (policy) {
>> -               ret_freq = policy->cur;
>>
>> +               if (cpufreq_driver && cpufreq_driver->setpolicy &&
>> +                       cpufreq_driver->get) {
>> +                       ret_freq = cpufreq_driver->get(cpu);
>> +               } else {
>> +                       ret_freq = policy->cur;
>> +               }
>>                  cpufreq_cpu_put(policy);

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 2817c3c..7516b7d 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1178,9 +1178,14 @@ static void cpufreq_out_of_sync(unsigned int cpu, 
unsigned int old_freq,
   */
  unsigned int cpufreq_quick_get(unsigned int cpu)
  {
-	struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
+	struct cpufreq_policy *policy;
  	unsigned int ret_freq = 0;

+	if (cpufreq_driver && cpufreq_driver->setpolicy &&
+			cpufreq_driver->get)
+			return cpufreq_driver->get(cpu);
+
+	policy = cpufreq_cpu_get(cpu);
  	if (policy) {
  		ret_freq = policy->cur;
  		cpufreq_cpu_put(policy);



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/7] cpufreq: Only query drivers that implement cpufreq_driver.target()
  2013-02-06  2:06     ` Dirk Brandewie
@ 2013-02-06  2:43       ` Viresh Kumar
  0 siblings, 0 replies; 25+ messages in thread
From: Viresh Kumar @ 2013-02-06  2:43 UTC (permalink / raw)
  To: Dirk Brandewie; +Cc: linux-kernel, cpufreq, Dirk Brandewie

On 6 February 2013 07:36, Dirk Brandewie <dirk.brandewie@gmail.com> wrote:
> On 02/05/2013 05:47 PM, Viresh Kumar wrote:
>>
>> On Tue, Feb 5, 2013 at 11:54 PM,  <dirk.brandewie@gmail.com> wrote:
>>>
>>> From: Dirk Brandewie <dirk.brandewie@gmail.com>
>>>
>>> Scaling drivers that implement cpufreq_driver.setpolicy() have
>>> internal governors and may/will change the current operating frequency
>>> very frequently this will cause cpufreq_out_of_sync() to be called
>>> every time. Only call cpufreq_driver.get() for drivers that implement
>>> cpufreq_driver.target()
>>>
>>> Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
>>> ---
>>>   drivers/cpufreq/cpufreq.c |    2 +-
>>>   1 files changed, 1 insertions(+), 1 deletions(-)
>>>
>>> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
>>> index 96bc302..d8daa4b 100644
>>> --- a/drivers/cpufreq/cpufreq.c
>>> +++ b/drivers/cpufreq/cpufreq.c
>>> @@ -1787,7 +1787,7 @@ int cpufreq_update_policy(unsigned int cpu)
>>>
>>>          /* BIOS might change freq behind our back
>>>            -> ask driver for current freq and notify governors about a
>>> change */
>>> -       if (cpufreq_driver->get) {
>>> +       if (cpufreq_driver->get && cpufreq_driver->target) {
>>>                  policy.cur = cpufreq_driver->get(cpu);
>>>                  if (!data->cur) {
>>>                          pr_debug("Driver did not initialize current
>>> freq");
>>
>>
>> I am really not liking copy-pasting my older comments here :(
>>
>> "This would mean policy->cur has a garbage value. I don't really know
>> how would other routine behave on this. Would it make sense to make
>> policy->cur zero atleast?
>> "
>>
> The driver implements get() and will return a valid value but the other
> components that track the current frequency will not have been notified
> about any change so there is nothing to be out of sync with.  There is no
> reason to call cpufreq_out_of_sync() where the driver being used implements
> an internal
> governor.

Not sure if we are discussing the same issue. What i am saying is, with your
patch we aren't calling following line:

>>>                  policy.cur = cpufreq_driver->get(cpu);

and policy was a local variable. Hence policy->cur is having a garbage value.
policy->cur is used at multiple places in cpufreq.c . Please check everywhere
if this garbage value doesn't break set_policy() type systems.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 5/7] cpufreq: balance out cpufreq_cpu_{get,put} for scaling drivers using setpolicy
  2013-02-06  2:08     ` Dirk Brandewie
@ 2013-02-06  2:45       ` Viresh Kumar
  2013-02-06 16:11         ` Dirk Brandewie
  0 siblings, 1 reply; 25+ messages in thread
From: Viresh Kumar @ 2013-02-06  2:45 UTC (permalink / raw)
  To: Dirk Brandewie; +Cc: linux-kernel, cpufreq, Dirk Brandewie

On 6 February 2013 07:38, Dirk Brandewie <dirk.brandewie@gmail.com> wrote:
> On 02/05/2013 05:58 PM, Viresh Kumar wrote:
>>
>> On Tue, Feb 5, 2013 at 11:54 PM,  <dirk.brandewie@gmail.com> wrote:
>>>
>>> From: Dirk Brandewie <dirk.brandewie@gmail.com>
>>>
>>> There is an additional reference added to the driver in
>>> cpufreq_add_dev()  that is removed in__cpufreq_governor() if the
>>>
>>> driver implements target().  Remove the last reference when the
>>> driver implements setpolicy()
>>>
>>> Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
>>>
>>> ---
>>>   drivers/cpufreq/cpufreq.c |    3 +++
>>>   1 files changed, 3 insertions(+), 0 deletions(-)
>>>
>>> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
>>> index 622e282..d17477b 100644
>>> --- a/drivers/cpufreq/cpufreq.c
>>> +++ b/drivers/cpufreq/cpufreq.c
>>> @@ -1049,6 +1049,9 @@ static int __cpufreq_remove_dev(struct device *dev,
>>> struct subsys_interface *sif
>>>
>>>          if (cpufreq_driver->target)
>>>                  __cpufreq_governor(data, CPUFREQ_GOV_STOP);
>>>
>>> +       if (cpufreq_driver->setpolicy)
>>> +               cpufreq_cpu_put(data);
>>
>>
>> I don't understand this patch at all.. I grepped both cpufreq_cpu_get() &
>> put()
>> in bleeding-edge and found everything to be correct.
>>
>> Can you please point me to the exact line numbers ?
>>
>
> Line 878 in cpufreq_add_dev()

Following is line 878:

	for_each_online_cpu(sibling) {
		struct cpufreq_policy *cp = per_cpu(cpufreq_cpu_data, sibling);
		if (cp && cpumask_test_cpu(cpu, cp->related_cpus))
			return cpufreq_add_policy_cpu(cpu, sibling, dev);
	}

How is this related to your patch?

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/7] cpufreq: Retrieve current frequency from scaling drivers with internal governors
  2013-02-06  2:31         ` Dirk Brandewie
@ 2013-02-06  2:46           ` Viresh Kumar
  0 siblings, 0 replies; 25+ messages in thread
From: Viresh Kumar @ 2013-02-06  2:46 UTC (permalink / raw)
  To: Dirk Brandewie; +Cc: linux-kernel, cpufreq, Dirk Brandewie

On 6 February 2013 08:01, Dirk Brandewie <dirk.brandewie@gmail.com> wrote:
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 2817c3c..7516b7d 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -1178,9 +1178,14 @@ static void cpufreq_out_of_sync(unsigned int cpu,
> unsigned int old_freq,
>   */
>
>  unsigned int cpufreq_quick_get(unsigned int cpu)
>  {
> -       struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
> +       struct cpufreq_policy *policy;
>
>         unsigned int ret_freq = 0;
>
> +       if (cpufreq_driver && cpufreq_driver->setpolicy &&
> +                       cpufreq_driver->get)
> +                       return cpufreq_driver->get(cpu);
> +
> +       policy = cpufreq_cpu_get(cpu);
>         if (policy) {
>                 ret_freq = policy->cur;
>                 cpufreq_cpu_put(policy);

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 5/7] cpufreq: balance out cpufreq_cpu_{get,put} for scaling drivers using setpolicy
  2013-02-06  2:45       ` Viresh Kumar
@ 2013-02-06 16:11         ` Dirk Brandewie
  2013-02-06 16:19           ` Viresh Kumar
  0 siblings, 1 reply; 25+ messages in thread
From: Dirk Brandewie @ 2013-02-06 16:11 UTC (permalink / raw)
  To: Viresh Kumar; +Cc: Dirk Brandewie, linux-kernel, cpufreq, Dirk Brandewie

On 02/05/2013 06:45 PM, Viresh Kumar wrote:
> On 6 February 2013 07:38, Dirk Brandewie <dirk.brandewie@gmail.com> wrote:
>> On 02/05/2013 05:58 PM, Viresh Kumar wrote:
>>>
>>> On Tue, Feb 5, 2013 at 11:54 PM,  <dirk.brandewie@gmail.com> wrote:
>>>>
>>>> From: Dirk Brandewie <dirk.brandewie@gmail.com>
>>>>
>>>> There is an additional reference added to the driver in
>>>> cpufreq_add_dev()  that is removed in__cpufreq_governor() if the
>>>>
>>>> driver implements target().  Remove the last reference when the
>>>> driver implements setpolicy()
>>>>
>>>> Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
>>>>
>>>> ---
>>>>    drivers/cpufreq/cpufreq.c |    3 +++
>>>>    1 files changed, 3 insertions(+), 0 deletions(-)
>>>>
>>>> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
>>>> index 622e282..d17477b 100644
>>>> --- a/drivers/cpufreq/cpufreq.c
>>>> +++ b/drivers/cpufreq/cpufreq.c
>>>> @@ -1049,6 +1049,9 @@ static int __cpufreq_remove_dev(struct device *dev,
>>>> struct subsys_interface *sif
>>>>
>>>>           if (cpufreq_driver->target)
>>>>                   __cpufreq_governor(data, CPUFREQ_GOV_STOP);
>>>>
>>>> +       if (cpufreq_driver->setpolicy)
>>>> +               cpufreq_cpu_put(data);
>>>
>>>
>>> I don't understand this patch at all.. I grepped both cpufreq_cpu_get() &
>>> put()
>>> in bleeding-edge and found everything to be correct.
>>>
>>> Can you please point me to the exact line numbers ?
>>>
>>
>> Line 878 in cpufreq_add_dev()
>
> Following is line 878:
>
> 	for_each_online_cpu(sibling) {
> 		struct cpufreq_policy *cp = per_cpu(cpufreq_cpu_data, sibling);
> 		if (cp && cpumask_test_cpu(cpu, cp->related_cpus))
> 			return cpufreq_add_policy_cpu(cpu, sibling, dev);
> 	}
>
> How is this related to your patch?
>
our files are clearly out of sync :-)  The code in cpufreq_add_dev() is
#ifdef CONFIG_SMP
	/* check whether a different CPU already registered this
	 * CPU because it is in the same boat. */
	policy = cpufreq_cpu_get(cpu);
	if (unlikely(policy)) {
		cpufreq_cpu_put(policy);
		return 0;
	}

The reference added by this cpufreq_cpu_get() is finally dropped in 
__cpufreq_remove_dev() with the call to __cpufreq_governor()

	if (driver->target)
		__cpufreq_governor(data, CPUFREQ_GOV_STOP);

Without this change I hang at:
		pr_debug("waiting for dropping of refcount\n");
		wait_for_completion(cmp);

--Dirk

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 5/7] cpufreq: balance out cpufreq_cpu_{get,put} for scaling drivers using setpolicy
  2013-02-06 16:11         ` Dirk Brandewie
@ 2013-02-06 16:19           ` Viresh Kumar
  0 siblings, 0 replies; 25+ messages in thread
From: Viresh Kumar @ 2013-02-06 16:19 UTC (permalink / raw)
  To: Dirk Brandewie; +Cc: linux-kernel, cpufreq, Dirk Brandewie

On 6 February 2013 21:41, Dirk Brandewie <dirk.brandewie@gmail.com> wrote:
> our files are clearly out of sync :-)  The code in cpufreq_add_dev() is

I was at latest bleeding-edge then, but i still have below code in my HEAD.

> #ifdef CONFIG_SMP
>         /* check whether a different CPU already registered this
>          * CPU because it is in the same boat. */
>         policy = cpufreq_cpu_get(cpu);
>         if (unlikely(policy)) {
>                 cpufreq_cpu_put(policy);
>                 return 0;
>         }
>
> The reference added by this cpufreq_cpu_get() is finally dropped in
> __cpufreq_remove_dev() with the call to __cpufreq_governor()

__cpufreq_governor() is not calling cpufreq_cpu_put() at all and is dropping
reference for the governor only, not for policy.

>         if (driver->target)
>                 __cpufreq_governor(data, CPUFREQ_GOV_STOP);
>
> Without this change I hang at:
>                 pr_debug("waiting for dropping of refcount\n");
>                 wait_for_completion(cmp);

Below should have fixed it for you i believe.

	pr_debug("%s: removing link, cpu: %d\n", __func__, cpu);
	cpufreq_cpu_put(data);
	unlock_policy_rwsem_write(cpu);

--
viresh

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2013-02-06 16:19 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-02-05 18:23 [PATCH 0/7] Add P state driver for Intel Core Processors dirk.brandewie
2013-02-05 18:24 ` [PATCH 1/7] cpufreq: Don't remove sysfs link for policy->cpu dirk.brandewie
2013-02-06  1:42   ` Viresh Kumar
2013-02-05 18:24 ` [PATCH 2/7] cpufreq: Retrieve current frequency from scaling drivers with internal governors dirk.brandewie
2013-02-06  1:41   ` Viresh Kumar
2013-02-06  1:45   ` Viresh Kumar
2013-02-06  2:15     ` Dirk Brandewie
2013-02-06  2:25       ` Viresh Kumar
2013-02-06  2:31         ` Dirk Brandewie
2013-02-06  2:46           ` Viresh Kumar
2013-02-05 18:24 ` [PATCH 3/7] cpufreq: Only query drivers that implement cpufreq_driver.target() dirk.brandewie
2013-02-06  1:47   ` Viresh Kumar
2013-02-06  2:06     ` Dirk Brandewie
2013-02-06  2:43       ` Viresh Kumar
2013-02-05 18:24 ` [PATCH 4/7] cpufreq: Do not track governor name for scaling drivers with internal governors dirk.brandewie
2013-02-06  1:50   ` Viresh Kumar
2013-02-05 18:24 ` [PATCH 5/7] cpufreq: balance out cpufreq_cpu_{get,put} for scaling drivers using setpolicy dirk.brandewie
2013-02-06  1:58   ` Viresh Kumar
2013-02-06  2:08     ` Dirk Brandewie
2013-02-06  2:45       ` Viresh Kumar
2013-02-06 16:11         ` Dirk Brandewie
2013-02-06 16:19           ` Viresh Kumar
2013-02-05 18:24 ` [PATCH 6/7] cpufreq_stats: do not remove sysfs files if frequency table is not present dirk.brandewie
2013-02-06  2:18   ` Viresh Kumar
2013-02-05 18:24 ` [PATCH 7/7] cpufreq/x86: Add P-state driver for sandy bridge dirk.brandewie

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).