linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/8 v2] acpi-cpufreq: Move modern AMD cpufreq support to acpi-cpufreq
@ 2012-09-04  8:28 Andre Przywara
  2012-09-04  8:28 ` [PATCH 1/8 v2] acpi-cpufreq: Add support for modern AMD CPUs Andre Przywara
                   ` (8 more replies)
  0 siblings, 9 replies; 14+ messages in thread
From: Andre Przywara @ 2012-09-04  8:28 UTC (permalink / raw)
  To: Rafael J. Wysocki, Thomas Renninger
  Cc: Matthew Garret, linux-pm, cpufreq, linux-kernel, Andreas Herrmann

Hi,

now the second, revised version of the patch set. I now tested loading
both drivers after each other in several combinations, after two bug
fixes this now works as expected.
I added a patch to move messages from powernow-k8 after the initialization
phase, so it remains silent if driver loading fails.

I also rearranged the patches so that the powernow-k8 feature removal patch is
now the last one and is somewhat optional. I still prefer to have it in,
but it can be dropped if needed. Then powernow-k8 will still support modern
AMD CPUs, but will emit a warning message.

Regards,
Andre

Changes from v1:
* added hints to Kconfig about CPU support
* merge documentation from separate patch into the feature patch
* add deprecation warning
* prefix acpi-cpufreq warning messages with module name
* bugfix: avoid boost init it driver registration failed
* bugfix: fix module redirect request call (was not reached before)




^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/8 v2] acpi-cpufreq: Add support for modern AMD CPUs
  2012-09-04  8:28 [PATCH 0/8 v2] acpi-cpufreq: Move modern AMD cpufreq support to acpi-cpufreq Andre Przywara
@ 2012-09-04  8:28 ` Andre Przywara
  2012-09-04  8:28 ` [PATCH 2/8 v2] acpi-cpufreq: Add quirk to disable _PSD usage on all " Andre Przywara
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 14+ messages in thread
From: Andre Przywara @ 2012-09-04  8:28 UTC (permalink / raw)
  To: Rafael J. Wysocki, Thomas Renninger
  Cc: Matthew Garret, linux-pm, cpufreq, linux-kernel,
	Andreas Herrmann, Andre Przywara

From: Matthew Garrett <mjg@redhat.com>

The programming model for P-states on modern AMD CPUs is very similar to
that of Intel and VIA. It makes sense to consolidate this support into one
driver rather than duplicating functionality between two of them. This
patch adds support for AMDs with hardware P-state control to acpi-cpufreq.

Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Andre Przywara <andre.przywara@amd.com>
---
 arch/x86/include/asm/msr-index.h |  2 ++
 drivers/cpufreq/Kconfig.x86      |  3 ++-
 drivers/cpufreq/acpi-cpufreq.c   | 43 ++++++++++++++++++++++++++++++++++------
 3 files changed, 41 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 957ec87..1e1f3eb 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -248,6 +248,8 @@
 
 #define MSR_IA32_PERF_STATUS		0x00000198
 #define MSR_IA32_PERF_CTL		0x00000199
+#define MSR_AMD_PERF_STATUS		0xc0010063
+#define MSR_AMD_PERF_CTL		0xc0010062
 
 #define MSR_IA32_MPERF			0x000000e7
 #define MSR_IA32_APERF			0x000000e8
diff --git a/drivers/cpufreq/Kconfig.x86 b/drivers/cpufreq/Kconfig.x86
index 78ff7ee..8d12e37 100644
--- a/drivers/cpufreq/Kconfig.x86
+++ b/drivers/cpufreq/Kconfig.x86
@@ -23,7 +23,8 @@ config X86_ACPI_CPUFREQ
 	help
 	  This driver adds a CPUFreq driver which utilizes the ACPI
 	  Processor Performance States.
-	  This driver also supports Intel Enhanced Speedstep.
+	  This driver also supports Intel Enhanced Speedstep and newer
+	  AMD CPUs.
 
 	  To compile this driver as a module, choose M here: the
 	  module will be called acpi-cpufreq.
diff --git a/drivers/cpufreq/acpi-cpufreq.c b/drivers/cpufreq/acpi-cpufreq.c
index 56c6c6b..067a61f 100644
--- a/drivers/cpufreq/acpi-cpufreq.c
+++ b/drivers/cpufreq/acpi-cpufreq.c
@@ -54,10 +54,12 @@ MODULE_LICENSE("GPL");
 enum {
 	UNDEFINED_CAPABLE = 0,
 	SYSTEM_INTEL_MSR_CAPABLE,
+	SYSTEM_AMD_MSR_CAPABLE,
 	SYSTEM_IO_CAPABLE,
 };
 
 #define INTEL_MSR_RANGE		(0xffff)
+#define AMD_MSR_RANGE		(0x7)
 
 struct acpi_cpufreq_data {
 	struct acpi_processor_performance *acpi_data;
@@ -82,6 +84,13 @@ static int check_est_cpu(unsigned int cpuid)
 	return cpu_has(cpu, X86_FEATURE_EST);
 }
 
+static int check_amd_hwpstate_cpu(unsigned int cpuid)
+{
+	struct cpuinfo_x86 *cpu = &cpu_data(cpuid);
+
+	return cpu_has(cpu, X86_FEATURE_HW_PSTATE);
+}
+
 static unsigned extract_io(u32 value, struct acpi_cpufreq_data *data)
 {
 	struct acpi_processor_performance *perf;
@@ -101,7 +110,11 @@ static unsigned extract_msr(u32 msr, struct acpi_cpufreq_data *data)
 	int i;
 	struct acpi_processor_performance *perf;
 
-	msr &= INTEL_MSR_RANGE;
+	if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD)
+		msr &= AMD_MSR_RANGE;
+	else
+		msr &= INTEL_MSR_RANGE;
+
 	perf = data->acpi_data;
 
 	for (i = 0; data->freq_table[i].frequency != CPUFREQ_TABLE_END; i++) {
@@ -115,6 +128,7 @@ static unsigned extract_freq(u32 val, struct acpi_cpufreq_data *data)
 {
 	switch (data->cpu_feature) {
 	case SYSTEM_INTEL_MSR_CAPABLE:
+	case SYSTEM_AMD_MSR_CAPABLE:
 		return extract_msr(val, data);
 	case SYSTEM_IO_CAPABLE:
 		return extract_io(val, data);
@@ -150,6 +164,7 @@ static void do_drv_read(void *_cmd)
 
 	switch (cmd->type) {
 	case SYSTEM_INTEL_MSR_CAPABLE:
+	case SYSTEM_AMD_MSR_CAPABLE:
 		rdmsr(cmd->addr.msr.reg, cmd->val, h);
 		break;
 	case SYSTEM_IO_CAPABLE:
@@ -174,6 +189,9 @@ static void do_drv_write(void *_cmd)
 		lo = (lo & ~INTEL_MSR_RANGE) | (cmd->val & INTEL_MSR_RANGE);
 		wrmsr(cmd->addr.msr.reg, lo, hi);
 		break;
+	case SYSTEM_AMD_MSR_CAPABLE:
+		wrmsr(cmd->addr.msr.reg, cmd->val, 0);
+		break;
 	case SYSTEM_IO_CAPABLE:
 		acpi_os_write_port((acpi_io_address)cmd->addr.io.port,
 				cmd->val,
@@ -217,6 +235,10 @@ static u32 get_cur_val(const struct cpumask *mask)
 		cmd.type = SYSTEM_INTEL_MSR_CAPABLE;
 		cmd.addr.msr.reg = MSR_IA32_PERF_STATUS;
 		break;
+	case SYSTEM_AMD_MSR_CAPABLE:
+		cmd.type = SYSTEM_AMD_MSR_CAPABLE;
+		cmd.addr.msr.reg = MSR_AMD_PERF_STATUS;
+		break;
 	case SYSTEM_IO_CAPABLE:
 		cmd.type = SYSTEM_IO_CAPABLE;
 		perf = per_cpu(acfreq_data, cpumask_first(mask))->acpi_data;
@@ -326,6 +348,11 @@ static int acpi_cpufreq_target(struct cpufreq_policy *policy,
 		cmd.addr.msr.reg = MSR_IA32_PERF_CTL;
 		cmd.val = (u32) perf->states[next_perf_state].control;
 		break;
+	case SYSTEM_AMD_MSR_CAPABLE:
+		cmd.type = SYSTEM_AMD_MSR_CAPABLE;
+		cmd.addr.msr.reg = MSR_AMD_PERF_CTL;
+		cmd.val = (u32) perf->states[next_perf_state].control;
+		break;
 	case SYSTEM_IO_CAPABLE:
 		cmd.type = SYSTEM_IO_CAPABLE;
 		cmd.addr.io.port = perf->control_register.address;
@@ -580,12 +607,16 @@ static int acpi_cpufreq_cpu_init(struct cpufreq_policy *policy)
 		break;
 	case ACPI_ADR_SPACE_FIXED_HARDWARE:
 		pr_debug("HARDWARE addr space\n");
-		if (!check_est_cpu(cpu)) {
-			result = -ENODEV;
-			goto err_unreg;
+		if (check_est_cpu(cpu)) {
+			data->cpu_feature = SYSTEM_INTEL_MSR_CAPABLE;
+			break;
 		}
-		data->cpu_feature = SYSTEM_INTEL_MSR_CAPABLE;
-		break;
+		if (check_amd_hwpstate_cpu(cpu)) {
+			data->cpu_feature = SYSTEM_AMD_MSR_CAPABLE;
+			break;
+		}
+		result = -ENODEV;
+		goto err_unreg;
 	default:
 		pr_debug("Unknown addr space %d\n",
 			(u32) (perf->control_register.space_id));
-- 
1.7.12



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 2/8 v2] acpi-cpufreq: Add quirk to disable _PSD usage on all AMD CPUs
  2012-09-04  8:28 [PATCH 0/8 v2] acpi-cpufreq: Move modern AMD cpufreq support to acpi-cpufreq Andre Przywara
  2012-09-04  8:28 ` [PATCH 1/8 v2] acpi-cpufreq: Add support for modern AMD CPUs Andre Przywara
@ 2012-09-04  8:28 ` Andre Przywara
       [not found]   ` <CACJDEmrOoZ_azqTrtfoGC-O=H0V=AHy8VExoFYYPhbUUx+7UAg@mail.gmail.com>
  2012-09-04  8:28 ` [PATCH 3/8 v2] cpufreq: Add warning message to powernow-k8 Andre Przywara
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 14+ messages in thread
From: Andre Przywara @ 2012-09-04  8:28 UTC (permalink / raw)
  To: Rafael J. Wysocki, Thomas Renninger
  Cc: Matthew Garret, linux-pm, cpufreq, linux-kernel,
	Andreas Herrmann, Andre Przywara

To workaround some Windows specific behavior, the ACPI _PSD table
on AMD desktop boards advertises all cores as dependent, meaning
that they all can only use the same P-state. acpi-cpufreq strictly
obeys this description, instantiating one CPU only and symlinking
the others. But the hardware can have distinct frequencies for each
core and powernow-k8 did it that way.
So, in order to use the hardware to its full potential and keep the
original powernow-k8 behavior, lets override the _PSD table setting
on AMD hardware.
We use the siblings table, as it matches the current hardware
behavior.

Signed-off-by: Andre Przywara <andre.przywara@amd.com>
---
 drivers/cpufreq/acpi-cpufreq.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/cpufreq/acpi-cpufreq.c b/drivers/cpufreq/acpi-cpufreq.c
index 067a61f..70e7173 100644
--- a/drivers/cpufreq/acpi-cpufreq.c
+++ b/drivers/cpufreq/acpi-cpufreq.c
@@ -51,6 +51,8 @@ MODULE_AUTHOR("Paul Diefenbaugh, Dominik Brodowski");
 MODULE_DESCRIPTION("ACPI Processor P-States Driver");
 MODULE_LICENSE("GPL");
 
+#define PFX "acpi-cpufreq: "
+
 enum {
 	UNDEFINED_CAPABLE = 0,
 	SYSTEM_INTEL_MSR_CAPABLE,
@@ -586,6 +588,14 @@ static int acpi_cpufreq_cpu_init(struct cpufreq_policy *policy)
 		policy->shared_type = CPUFREQ_SHARED_TYPE_ALL;
 		cpumask_copy(policy->cpus, cpu_core_mask(cpu));
 	}
+
+	if (check_amd_hwpstate_cpu(cpu) && !acpi_pstate_strict) {
+		cpumask_clear(policy->cpus);
+		cpumask_set_cpu(cpu, policy->cpus);
+		cpumask_copy(policy->related_cpus, cpu_sibling_mask(cpu));
+		policy->shared_type = CPUFREQ_SHARED_TYPE_HW;
+		pr_info_once(PFX "overriding BIOS provided _PSD data\n");
+	}
 #endif
 
 	/* capability check */
-- 
1.7.12



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 3/8 v2] cpufreq: Add warning message to powernow-k8
  2012-09-04  8:28 [PATCH 0/8 v2] acpi-cpufreq: Move modern AMD cpufreq support to acpi-cpufreq Andre Przywara
  2012-09-04  8:28 ` [PATCH 1/8 v2] acpi-cpufreq: Add support for modern AMD CPUs Andre Przywara
  2012-09-04  8:28 ` [PATCH 2/8 v2] acpi-cpufreq: Add quirk to disable _PSD usage on all " Andre Przywara
@ 2012-09-04  8:28 ` Andre Przywara
  2012-09-04  8:28 ` [PATCH 4/8 v2] powernow-k8: delay info messages until initialization has succeeded Andre Przywara
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 14+ messages in thread
From: Andre Przywara @ 2012-09-04  8:28 UTC (permalink / raw)
  To: Rafael J. Wysocki, Thomas Renninger
  Cc: Matthew Garret, linux-pm, cpufreq, linux-kernel,
	Andreas Herrmann, Andre Przywara

cpufreq modules are often loaded from init scripts that assume that
all recent AMD systems will use powernow-k8.
To inform the user of the change of support and ease the transition
to acpi-cpufreq, emit a warning message.

Signed-off-by: Andre Przywara <andre.przywara@amd.com>
---
 drivers/cpufreq/Kconfig.x86   | 3 ++-
 drivers/cpufreq/powernow-k8.c | 3 +++
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/cpufreq/Kconfig.x86 b/drivers/cpufreq/Kconfig.x86
index 8d12e37..b36ca1f 100644
--- a/drivers/cpufreq/Kconfig.x86
+++ b/drivers/cpufreq/Kconfig.x86
@@ -96,7 +96,8 @@ config X86_POWERNOW_K8
 	select CPU_FREQ_TABLE
 	depends on ACPI && ACPI_PROCESSOR
 	help
-	  This adds the CPUFreq driver for K8/K10 Opteron/Athlon64 processors.
+	  This adds the CPUFreq driver for K8/early Opteron/Athlon64 processors.
+	  Support for K10 and newer processors is now in acpi-cpufreq.
 
 	  To compile this driver as a module, choose M here: the
 	  module will be called powernow-k8.
diff --git a/drivers/cpufreq/powernow-k8.c b/drivers/cpufreq/powernow-k8.c
index c0e8164..16c7fb6 100644
--- a/drivers/cpufreq/powernow-k8.c
+++ b/drivers/cpufreq/powernow-k8.c
@@ -1560,6 +1560,9 @@ static int __cpuinit powernowk8_init(void)
 	if (!x86_match_cpu(powernow_k8_ids))
 		return -ENODEV;
 
+	if (static_cpu_has(X86_FEATURE_HW_PSTATE))
+		pr_warn(PFX "support for this CPU is deprecated, use acpi-cpufreq instead.\n");
+
 	for_each_online_cpu(i) {
 		int rc;
 		smp_call_function_single(i, check_supported_cpu, &rc, 1);
-- 
1.7.12



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 4/8 v2] powernow-k8: delay info messages until initialization has succeeded
  2012-09-04  8:28 [PATCH 0/8 v2] acpi-cpufreq: Move modern AMD cpufreq support to acpi-cpufreq Andre Przywara
                   ` (2 preceding siblings ...)
  2012-09-04  8:28 ` [PATCH 3/8 v2] cpufreq: Add warning message to powernow-k8 Andre Przywara
@ 2012-09-04  8:28 ` Andre Przywara
  2012-09-04  8:28 ` [PATCH 5/8 v2] ACPI: Add fixups for AMD P-state figures Andre Przywara
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 14+ messages in thread
From: Andre Przywara @ 2012-09-04  8:28 UTC (permalink / raw)
  To: Rafael J. Wysocki, Thomas Renninger
  Cc: Matthew Garret, linux-pm, cpufreq, linux-kernel,
	Andreas Herrmann, Andre Przywara

powernow-k8 is quite prematurely crying Hooray and outputs diagnostic
messages, although the actual initialization can still fail.
Since now we may have acpi-cpufreq already loaded, we move the
messages at the end of the init routine to avoid confusing output
if the loading of powernow-k8 should not succeed.

Signed-off-by: Andre Przywara <andre.przywara@amd.com>
---
 drivers/cpufreq/powernow-k8.c | 24 ++++++++++++++----------
 1 file changed, 14 insertions(+), 10 deletions(-)

diff --git a/drivers/cpufreq/powernow-k8.c b/drivers/cpufreq/powernow-k8.c
index 16c7fb6..8ff0621 100644
--- a/drivers/cpufreq/powernow-k8.c
+++ b/drivers/cpufreq/powernow-k8.c
@@ -1573,9 +1573,6 @@ static int __cpuinit powernowk8_init(void)
 	if (supported_cpus != num_online_cpus())
 		return -ENODEV;
 
-	printk(KERN_INFO PFX "Found %d %s (%d cpu cores) (" VERSION ")\n",
-		num_online_nodes(), boot_cpu_data.x86_model_id, supported_cpus);
-
 	if (boot_cpu_has(X86_FEATURE_CPB)) {
 
 		cpb_capable = true;
@@ -1594,16 +1591,23 @@ static int __cpuinit powernowk8_init(void)
 			struct msr *reg = per_cpu_ptr(msrs, cpu);
 			cpb_enabled |= !(!!(reg->l & BIT(25)));
 		}
-
-		printk(KERN_INFO PFX "Core Performance Boosting: %s.\n",
-			(cpb_enabled ? "on" : "off"));
 	}
 
 	rv = cpufreq_register_driver(&cpufreq_amd64_driver);
-	if (rv < 0 && boot_cpu_has(X86_FEATURE_CPB)) {
-		unregister_cpu_notifier(&cpb_nb);
-		msrs_free(msrs);
-		msrs = NULL;
+
+	if (!rv)
+		pr_info(PFX "Found %d %s (%d cpu cores) (" VERSION ")\n",
+			num_online_nodes(), boot_cpu_data.x86_model_id,
+			supported_cpus);
+
+	if (boot_cpu_has(X86_FEATURE_CPB)) {
+		if (rv < 0) {
+			unregister_cpu_notifier(&cpb_nb);
+			msrs_free(msrs);
+			msrs = NULL;
+		} else
+			pr_info(PFX "Core Performance Boosting: %s.\n",
+				(cpb_enabled ? "on" : "off"));
 	}
 	return rv;
 }
-- 
1.7.12



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 5/8 v2] ACPI: Add fixups for AMD P-state figures
  2012-09-04  8:28 [PATCH 0/8 v2] acpi-cpufreq: Move modern AMD cpufreq support to acpi-cpufreq Andre Przywara
                   ` (3 preceding siblings ...)
  2012-09-04  8:28 ` [PATCH 4/8 v2] powernow-k8: delay info messages until initialization has succeeded Andre Przywara
@ 2012-09-04  8:28 ` Andre Przywara
  2012-09-04  8:28 ` [PATCH 6/8 v2] acpi-cpufreq: Add support for disabling dynamic overclocking Andre Przywara
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 14+ messages in thread
From: Andre Przywara @ 2012-09-04  8:28 UTC (permalink / raw)
  To: Rafael J. Wysocki, Thomas Renninger
  Cc: Matthew Garret, linux-pm, cpufreq, linux-kernel,
	Andreas Herrmann, Andre Przywara

From: Matthew Garrett <mjg@redhat.com>

Some AMD systems may round the frequencies in ACPI tables to 100MHz
boundaries. We can obtain the real frequencies from MSRs, so add a quirk
to fix these frequencies up on AMD systems.

Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Andre Przywara <andre.przywara@amd.com>
---
 arch/x86/include/asm/msr-index.h |  1 +
 drivers/acpi/processor_perflib.c | 30 ++++++++++++++++++++++++++++++
 2 files changed, 31 insertions(+)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 1e1f3eb..fbee971 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -248,6 +248,7 @@
 
 #define MSR_IA32_PERF_STATUS		0x00000198
 #define MSR_IA32_PERF_CTL		0x00000199
+#define MSR_AMD_PSTATE_DEF_BASE		0xc0010064
 #define MSR_AMD_PERF_STATUS		0xc0010063
 #define MSR_AMD_PERF_CTL		0xc0010062
 
diff --git a/drivers/acpi/processor_perflib.c b/drivers/acpi/processor_perflib.c
index a093dc1..836bfe0 100644
--- a/drivers/acpi/processor_perflib.c
+++ b/drivers/acpi/processor_perflib.c
@@ -324,6 +324,34 @@ static int acpi_processor_get_performance_control(struct acpi_processor *pr)
 	return result;
 }
 
+#ifdef CONFIG_X86
+/*
+ * Some AMDs have 50MHz frequency multiples, but only provide 100MHz rounding
+ * in their ACPI data. Calculate the real values and fix up the _PSS data.
+ */
+static void amd_fixup_frequency(struct acpi_processor_px *px, int i)
+{
+	u32 hi, lo, fid, did;
+	int index = px->control & 0x00000007;
+
+	if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD)
+		return;
+
+	if ((boot_cpu_data.x86 == 0x10 && boot_cpu_data.x86_model < 10)
+	    || boot_cpu_data.x86 == 0x11) {
+		rdmsr(MSR_AMD_PSTATE_DEF_BASE + index, lo, hi);
+		fid = lo & 0x3f;
+		did = (lo >> 6) & 7;
+		if (boot_cpu_data.x86 == 0x10)
+			px->core_frequency = (100 * (fid + 0x10)) >> did;
+		else
+			px->core_frequency = (100 * (fid + 8)) >> did;
+	}
+}
+#else
+static void amd_fixup_frequency(struct acpi_processor_px *px, int i) {};
+#endif
+
 static int acpi_processor_get_performance_states(struct acpi_processor *pr)
 {
 	int result = 0;
@@ -379,6 +407,8 @@ static int acpi_processor_get_performance_states(struct acpi_processor *pr)
 			goto end;
 		}
 
+		amd_fixup_frequency(px, i);
+
 		ACPI_DEBUG_PRINT((ACPI_DB_INFO,
 				  "State [%d]: core_frequency[%d] power[%d] transition_latency[%d] bus_master_latency[%d] control[0x%x] status[0x%x]\n",
 				  i,
-- 
1.7.12



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 6/8 v2] acpi-cpufreq: Add support for disabling dynamic overclocking
  2012-09-04  8:28 [PATCH 0/8 v2] acpi-cpufreq: Move modern AMD cpufreq support to acpi-cpufreq Andre Przywara
                   ` (4 preceding siblings ...)
  2012-09-04  8:28 ` [PATCH 5/8 v2] ACPI: Add fixups for AMD P-state figures Andre Przywara
@ 2012-09-04  8:28 ` Andre Przywara
  2012-09-04  8:28 ` [PATCH 7/8 v2] acpi-cpufreq: Add compatibility for legacy AMD cpb sysfs knob Andre Przywara
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 14+ messages in thread
From: Andre Przywara @ 2012-09-04  8:28 UTC (permalink / raw)
  To: Rafael J. Wysocki, Thomas Renninger
  Cc: Matthew Garret, linux-pm, cpufreq, linux-kernel,
	Andreas Herrmann, Andre Przywara

One feature present in powernow-k8 that isn't present in acpi-cpufreq
is support for enabling or disabling AMD's core performance boost
technology. This patch adds support to acpi-cpufreq, but also
includes support for Intel's dynamic acceleration.

The original boost disabling sysfs file was per CPU, but acted
globally. Also the naming (cpb) was at least not intuitive.
So lets introduce a single file simply called "boost", which sits
once in /sys/devices/system/cpu/cpufreq.
This should be the only way of using this feature, so add
documentation about the rationale and the usage.

A following patch will re-introduce the cpb knob for compatibility
reasons on AMD CPUs.

Per-CPU boost switching is possible, but not trivial and is thus
postponed to a later patch series.

Signed-off-by: Andre Przywara <andre.przywara@amd.com>
---
 Documentation/ABI/testing/sysfs-devices-system-cpu |  11 ++
 Documentation/cpu-freq/boost.txt                   |  93 +++++++++++
 drivers/cpufreq/acpi-cpufreq.c                     | 177 +++++++++++++++++++++
 3 files changed, 281 insertions(+)
 create mode 100644 Documentation/cpu-freq/boost.txt

diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
index 5dab364..6943133 100644
--- a/Documentation/ABI/testing/sysfs-devices-system-cpu
+++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
@@ -176,3 +176,14 @@ Description:	Disable L3 cache indices
 		All AMD processors with L3 caches provide this functionality.
 		For details, see BKDGs at
 		http://developer.amd.com/documentation/guides/Pages/default.aspx
+
+
+What:		/sys/devices/system/cpu/cpufreq/boost
+Date:		August 2012
+Contact:	Linux kernel mailing list <linux-kernel@vger.kernel.org>
+Description:	Processor frequency boosting control
+
+		This switch controls the boost setting for the whole system.
+		Boosting allows the CPU and the firmware to run at a frequency
+		beyound it's nominal limit.
+		More details can be found in Documentation/cpu-freq/boost.txt
diff --git a/Documentation/cpu-freq/boost.txt b/Documentation/cpu-freq/boost.txt
new file mode 100644
index 0000000..9b4edfc
--- /dev/null
+++ b/Documentation/cpu-freq/boost.txt
@@ -0,0 +1,93 @@
+Processor boosting control
+
+	- information for users -
+
+Quick guide for the impatient:
+--------------------
+/sys/devices/system/cpu/cpufreq/boost
+controls the boost setting for the whole system. You can read and write
+that file with either "0" (boosting disabled) or "1" (boosting allowed).
+Reading or writing 1 does not mean that the system is boosting at this
+very moment, but only that the CPU _may_ raise the frequency at it's
+discretion.
+--------------------
+
+Introduction
+-------------
+Some CPUs support a functionality to raise the operating frequency of
+some cores in a multi-core package if certain conditions apply, mostly
+if the whole chip is not fully utilized and below it's intended thermal
+budget. This is done without operating system control by a combination
+of hardware and firmware.
+On Intel CPUs this is called "Turbo Boost", AMD calls it "Turbo-Core",
+in technical documentation "Core performance boost". In Linux we use
+the term "boost" for convenience.
+
+Rationale for disable switch
+----------------------------
+
+Though the idea is to just give better performance without any user
+intervention, sometimes the need arises to disable this functionality.
+Most systems offer a switch in the (BIOS) firmware to disable the
+functionality at all, but a more fine-grained and dynamic control would
+be desirable:
+1. While running benchmarks, reproducible results are important. Since
+   the boosting functionality depends on the load of the whole package,
+   single thread performance can vary. By explicitly disabling the boost
+   functionality at least for the benchmark's run-time the system will run
+   at a fixed frequency and results are reproducible again.
+2. To examine the impact of the boosting functionality it is helpful
+   to do tests with and without boosting.
+3. Boosting means overclocking the processor, though under controlled
+   conditions. By raising the frequency and the voltage the processor
+   will consume more power than without the boosting, which may be
+   undesirable for instance for mobile users. Disabling boosting may
+   save power here, though this depends on the workload.
+
+
+User controlled switch
+----------------------
+
+To allow the user to toggle the boosting functionality, the acpi-cpufreq
+driver exports a sysfs knob to disable it. There is a file:
+/sys/devices/system/cpu/cpufreq/boost
+which can either read "0" (boosting disabled) or "1" (boosting enabled).
+Reading the file is always supported, even if the processor does not
+support boosting. In this case the file will be read-only and always
+reads as "0". Explicitly changing the permissions and writing to that
+file anyway will return EINVAL.
+
+On supported CPUs one can write either a "0" or a "1" into this file.
+This will either disable the boost functionality on all cores in the
+whole system (0) or will allow the hardware to boost at will (1).
+
+Writing a "1" does not explicitly boost the system, but just allows the
+CPU (and the firmware) to boost at their discretion. Some implementations
+take external factors like the chip's temperature into account, so
+boosting once does not necessarily mean that it will occur every time
+even using the exact same software setup.
+
+
+AMD legacy cpb switch
+---------------------
+The AMD powernow-k8 driver used to support a very similar switch to
+disable or enable the "Core Performance Boost" feature of some AMD CPUs.
+This switch was instantiated in each CPU's cpufreq directory
+(/sys/devices/system/cpu[0-9]*/cpufreq) and was called "cpb".
+Though the per CPU existence hints at a more fine grained control, the
+actual implementation only supported a system-global switch semantics,
+which was simply reflected into each CPU's file. Writing a 0 or 1 into it
+would pull the other CPUs to the same state.
+For compatibility reasons this file and its behavior is still supported
+on AMD CPUs, though it is now protected by a config switch
+(X86_ACPI_CPUFREQ_CPB). On Intel CPUs this file will never be created,
+even with the config option set.
+This functionality is considered legacy and will be removed in some future
+kernel version.
+
+More fine grained boosting control
+----------------------------------
+
+Technically it is possible to switch the boosting functionality at least
+on a per package basis, for some CPUs even per core. Currently the driver
+does not support it, but this may be implemented in the future.
diff --git a/drivers/cpufreq/acpi-cpufreq.c b/drivers/cpufreq/acpi-cpufreq.c
index 70e7173..dffa7af 100644
--- a/drivers/cpufreq/acpi-cpufreq.c
+++ b/drivers/cpufreq/acpi-cpufreq.c
@@ -63,6 +63,8 @@ enum {
 #define INTEL_MSR_RANGE		(0xffff)
 #define AMD_MSR_RANGE		(0x7)
 
+#define MSR_K7_HWCR_CPB_DIS	(1ULL << 25)
+
 struct acpi_cpufreq_data {
 	struct acpi_processor_performance *acpi_data;
 	struct cpufreq_frequency_table *freq_table;
@@ -78,6 +80,96 @@ static struct acpi_processor_performance __percpu *acpi_perf_data;
 static struct cpufreq_driver acpi_cpufreq_driver;
 
 static unsigned int acpi_pstate_strict;
+static bool boost_enabled, boost_supported;
+static struct msr __percpu *msrs;
+
+static bool boost_state(unsigned int cpu)
+{
+	u32 lo, hi;
+	u64 msr;
+
+	switch (boot_cpu_data.x86_vendor) {
+	case X86_VENDOR_INTEL:
+		rdmsr_on_cpu(cpu, MSR_IA32_MISC_ENABLE, &lo, &hi);
+		msr = lo | ((u64)hi << 32);
+		return !(msr & MSR_IA32_MISC_ENABLE_TURBO_DISABLE);
+	case X86_VENDOR_AMD:
+		rdmsr_on_cpu(cpu, MSR_K7_HWCR, &lo, &hi);
+		msr = lo | ((u64)hi << 32);
+		return !(msr & MSR_K7_HWCR_CPB_DIS);
+	}
+	return false;
+}
+
+static void boost_set_msrs(bool enable, const struct cpumask *cpumask)
+{
+	u32 cpu;
+	u32 msr_addr;
+	u64 msr_mask;
+
+	switch (boot_cpu_data.x86_vendor) {
+	case X86_VENDOR_INTEL:
+		msr_addr = MSR_IA32_MISC_ENABLE;
+		msr_mask = MSR_IA32_MISC_ENABLE_TURBO_DISABLE;
+		break;
+	case X86_VENDOR_AMD:
+		msr_addr = MSR_K7_HWCR;
+		msr_mask = MSR_K7_HWCR_CPB_DIS;
+		break;
+	default:
+		return;
+	}
+
+	rdmsr_on_cpus(cpumask, msr_addr, msrs);
+
+	for_each_cpu(cpu, cpumask) {
+		struct msr *reg = per_cpu_ptr(msrs, cpu);
+		if (enable)
+			reg->q &= ~msr_mask;
+		else
+			reg->q |= msr_mask;
+	}
+
+	wrmsr_on_cpus(cpumask, msr_addr, msrs);
+}
+
+static ssize_t store_global_boost(struct kobject *kobj, struct attribute *attr,
+				  const char *buf, size_t count)
+{
+	int ret;
+	unsigned long val = 0;
+
+	if (!boost_supported)
+		return -EINVAL;
+
+	ret = kstrtoul(buf, 10, &val);
+	if (ret || (val > 1))
+		return -EINVAL;
+
+	if ((val && boost_enabled) || (!val && !boost_enabled))
+		return count;
+
+	get_online_cpus();
+
+	boost_set_msrs(val, cpu_online_mask);
+
+	put_online_cpus();
+
+	boost_enabled = val;
+	pr_debug("Core Boosting %sabled.\n", val ? "en" : "dis");
+
+	return count;
+}
+
+static ssize_t show_global_boost(struct kobject *kobj,
+				 struct attribute *attr, char *buf)
+{
+	return sprintf(buf, "%u\n", boost_enabled);
+}
+
+static struct global_attr global_boost = __ATTR(boost, 0644,
+						show_global_boost,
+						store_global_boost);
 
 static int check_est_cpu(unsigned int cpuid)
 {
@@ -448,6 +540,44 @@ static void free_acpi_perf_data(void)
 	free_percpu(acpi_perf_data);
 }
 
+static int boost_notify(struct notifier_block *nb, unsigned long action,
+		      void *hcpu)
+{
+	unsigned cpu = (long)hcpu;
+	const struct cpumask *cpumask;
+
+	cpumask = get_cpu_mask(cpu);
+
+	/*
+	 * Clear the boost-disable bit on the CPU_DOWN path so that
+	 * this cpu cannot block the remaining ones from boosting. On
+	 * the CPU_UP path we simply keep the boost-disable flag in
+	 * sync with the current global state.
+	 */
+
+	switch (action) {
+	case CPU_UP_PREPARE:
+	case CPU_UP_PREPARE_FROZEN:
+		boost_set_msrs(boost_enabled, cpumask);
+		break;
+
+	case CPU_DOWN_PREPARE:
+	case CPU_DOWN_PREPARE_FROZEN:
+		boost_set_msrs(1, cpumask);
+		break;
+
+	default:
+		break;
+	}
+
+	return NOTIFY_OK;
+}
+
+
+static struct notifier_block boost_nb = {
+	.notifier_call          = boost_notify,
+};
+
 /*
  * acpi_cpufreq_early_init - initialize ACPI P-States library
  *
@@ -774,6 +904,49 @@ static struct cpufreq_driver acpi_cpufreq_driver = {
 	.attr		= acpi_cpufreq_attr,
 };
 
+static void __init acpi_cpufreq_boost_init(void)
+{
+	if (boot_cpu_has(X86_FEATURE_CPB) || boot_cpu_has(X86_FEATURE_IDA)) {
+		msrs = msrs_alloc();
+
+		if (!msrs)
+			return;
+
+		boost_supported = true;
+		boost_enabled = boost_state(0);
+
+		get_online_cpus();
+
+		/* Force all MSRs to the same value */
+		boost_set_msrs(boost_enabled, cpu_online_mask);
+
+		register_cpu_notifier(&boost_nb);
+
+		put_online_cpus();
+	} else
+		global_boost.attr.mode = 0444;
+
+	/* We create the boost file in any case, though for systems without
+	 * hardware support it will be read-only and hardwired to return 0.
+	 */
+	if (sysfs_create_file(cpufreq_global_kobject, &(global_boost.attr)))
+		pr_warn(PFX "could not register global boost sysfs file\n");
+	else
+		pr_debug("registered global boost sysfs file\n");
+}
+
+static void __exit acpi_cpufreq_boost_exit(void)
+{
+	sysfs_remove_file(cpufreq_global_kobject, &(global_boost.attr));
+
+	if (msrs) {
+		unregister_cpu_notifier(&boost_nb);
+
+		msrs_free(msrs);
+		msrs = NULL;
+	}
+}
+
 static int __init acpi_cpufreq_init(void)
 {
 	int ret;
@@ -790,6 +963,8 @@ static int __init acpi_cpufreq_init(void)
 	ret = cpufreq_register_driver(&acpi_cpufreq_driver);
 	if (ret)
 		free_acpi_perf_data();
+	else
+		acpi_cpufreq_boost_init();
 
 	return ret;
 }
@@ -798,6 +973,8 @@ static void __exit acpi_cpufreq_exit(void)
 {
 	pr_debug("acpi_cpufreq_exit\n");
 
+	acpi_cpufreq_boost_exit();
+
 	cpufreq_unregister_driver(&acpi_cpufreq_driver);
 
 	free_acpi_perf_data();
-- 
1.7.12



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 7/8 v2] acpi-cpufreq: Add compatibility for legacy AMD cpb sysfs knob
  2012-09-04  8:28 [PATCH 0/8 v2] acpi-cpufreq: Move modern AMD cpufreq support to acpi-cpufreq Andre Przywara
                   ` (5 preceding siblings ...)
  2012-09-04  8:28 ` [PATCH 6/8 v2] acpi-cpufreq: Add support for disabling dynamic overclocking Andre Przywara
@ 2012-09-04  8:28 ` Andre Przywara
  2012-09-04  8:28 ` [PATCH 8/8 v2] cpufreq: Remove support for hardware P-state chips from powernow-k8 Andre Przywara
  2012-09-05 13:46 ` [PATCH 0/8 v2] acpi-cpufreq: Move modern AMD cpufreq support to acpi-cpufreq Rafael J. Wysocki
  8 siblings, 0 replies; 14+ messages in thread
From: Andre Przywara @ 2012-09-04  8:28 UTC (permalink / raw)
  To: Rafael J. Wysocki, Thomas Renninger
  Cc: Matthew Garret, linux-pm, cpufreq, linux-kernel,
	Andreas Herrmann, Andre Przywara

The powernow-k8 driver supported a sysfs knob called "cpb", which was
instantiated per CPU, but actually acted globally for the whole
system. To keep some compatibility with this feature, we re-introduce
this behavior here, but:
a) only enable it on AMD CPUs and
b) protect it with a Kconfig switch

I'd like to consider this feature obsolete. Lets keep it around for
some kernel versions and then phase it out.

Signed-off-by: Andre Przywara <andre.przywara@amd.com>
---
 drivers/cpufreq/Kconfig.x86    | 12 +++++++++++
 drivers/cpufreq/acpi-cpufreq.c | 46 ++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 56 insertions(+), 2 deletions(-)

diff --git a/drivers/cpufreq/Kconfig.x86 b/drivers/cpufreq/Kconfig.x86
index b36ca1f..934854a 100644
--- a/drivers/cpufreq/Kconfig.x86
+++ b/drivers/cpufreq/Kconfig.x86
@@ -33,6 +33,18 @@ config X86_ACPI_CPUFREQ
 
 	  If in doubt, say N.
 
+config X86_ACPI_CPUFREQ_CPB
+	default y
+	bool "Legacy cpb sysfs knob support for AMD CPUs"
+	depends on X86_ACPI_CPUFREQ && CPU_SUP_AMD
+	help
+	  The powernow-k8 driver used to provide a sysfs knob called "cpb"
+	  to disable the Core Performance Boosting feature of AMD CPUs. This
+	  file has now been superseeded by the more generic "boost" entry.
+
+	  By enabling this option the acpi_cpufreq driver provides the old
+	  entry in addition to the new boost ones, for compatibility reasons.
+
 config ELAN_CPUFREQ
 	tristate "AMD Elan SC400 and SC410"
 	select CPU_FREQ_TABLE
diff --git a/drivers/cpufreq/acpi-cpufreq.c b/drivers/cpufreq/acpi-cpufreq.c
index dffa7af..0d048f6 100644
--- a/drivers/cpufreq/acpi-cpufreq.c
+++ b/drivers/cpufreq/acpi-cpufreq.c
@@ -133,8 +133,7 @@ static void boost_set_msrs(bool enable, const struct cpumask *cpumask)
 	wrmsr_on_cpus(cpumask, msr_addr, msrs);
 }
 
-static ssize_t store_global_boost(struct kobject *kobj, struct attribute *attr,
-				  const char *buf, size_t count)
+static ssize_t _store_boost(const char *buf, size_t count)
 {
 	int ret;
 	unsigned long val = 0;
@@ -161,6 +160,12 @@ static ssize_t store_global_boost(struct kobject *kobj, struct attribute *attr,
 	return count;
 }
 
+static ssize_t store_global_boost(struct kobject *kobj, struct attribute *attr,
+				  const char *buf, size_t count)
+{
+	return _store_boost(buf, count);
+}
+
 static ssize_t show_global_boost(struct kobject *kobj,
 				 struct attribute *attr, char *buf)
 {
@@ -171,6 +176,21 @@ static struct global_attr global_boost = __ATTR(boost, 0644,
 						show_global_boost,
 						store_global_boost);
 
+#ifdef CONFIG_X86_ACPI_CPUFREQ_CPB
+static ssize_t store_cpb(struct cpufreq_policy *policy, const char *buf,
+			 size_t count)
+{
+	return _store_boost(buf, count);
+}
+
+static ssize_t show_cpb(struct cpufreq_policy *policy, char *buf)
+{
+	return sprintf(buf, "%u\n", boost_enabled);
+}
+
+static struct freq_attr cpb = __ATTR(cpb, 0644, show_cpb, store_cpb);
+#endif
+
 static int check_est_cpu(unsigned int cpuid)
 {
 	struct cpuinfo_x86 *cpu = &cpu_data(cpuid);
@@ -889,6 +909,7 @@ static int acpi_cpufreq_resume(struct cpufreq_policy *policy)
 
 static struct freq_attr *acpi_cpufreq_attr[] = {
 	&cpufreq_freq_attr_scaling_available_freqs,
+	NULL,	/* this is a placeholder for cpb, do not remove */
 	NULL,
 };
 
@@ -960,6 +981,27 @@ static int __init acpi_cpufreq_init(void)
 	if (ret)
 		return ret;
 
+#ifdef CONFIG_X86_ACPI_CPUFREQ_CPB
+	/* this is a sysfs file with a strange name and an even stranger
+	 * semantic - per CPU instantiation, but system global effect.
+	 * Lets enable it only on AMD CPUs for compatibility reasons and
+	 * only if configured. This is considered legacy code, which
+	 * will probably be removed at some point in the future.
+	 */
+	if (check_amd_hwpstate_cpu(0)) {
+		struct freq_attr **iter;
+
+		pr_debug("adding sysfs entry for cpb\n");
+
+		for (iter = acpi_cpufreq_attr; *iter != NULL; iter++)
+			;
+
+		/* make sure there is a terminator behind it */
+		if (iter[1] == NULL)
+			*iter = &cpb;
+	}
+#endif
+
 	ret = cpufreq_register_driver(&acpi_cpufreq_driver);
 	if (ret)
 		free_acpi_perf_data();
-- 
1.7.12



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 8/8 v2] cpufreq: Remove support for hardware P-state chips from powernow-k8
  2012-09-04  8:28 [PATCH 0/8 v2] acpi-cpufreq: Move modern AMD cpufreq support to acpi-cpufreq Andre Przywara
                   ` (6 preceding siblings ...)
  2012-09-04  8:28 ` [PATCH 7/8 v2] acpi-cpufreq: Add compatibility for legacy AMD cpb sysfs knob Andre Przywara
@ 2012-09-04  8:28 ` Andre Przywara
  2012-09-05 13:46 ` [PATCH 0/8 v2] acpi-cpufreq: Move modern AMD cpufreq support to acpi-cpufreq Rafael J. Wysocki
  8 siblings, 0 replies; 14+ messages in thread
From: Andre Przywara @ 2012-09-04  8:28 UTC (permalink / raw)
  To: Rafael J. Wysocki, Thomas Renninger
  Cc: Matthew Garret, linux-pm, cpufreq, linux-kernel,
	Andreas Herrmann, Andre Przywara

From: Matthew Garrett <mjg@redhat.com>

These chips are now supported by acpi-cpufreq, so we can delete all the
code handling them.

Andre: Tighten the deprecation warning message. Trigger load of
acpi-cpufreq and let the load of the module finally fail.
This avoids the problem of users ending up without any cpufreq support
after the transition.

Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Andre Przywara <andre.przywara@amd.com>
---
 drivers/cpufreq/Makefile      |   2 +-
 drivers/cpufreq/powernow-k8.c | 392 +++---------------------------------------
 drivers/cpufreq/powernow-k8.h |  32 ----
 3 files changed, 29 insertions(+), 397 deletions(-)

diff --git a/drivers/cpufreq/Makefile b/drivers/cpufreq/Makefile
index 9531fc2..b99790f 100644
--- a/drivers/cpufreq/Makefile
+++ b/drivers/cpufreq/Makefile
@@ -19,7 +19,7 @@ obj-$(CONFIG_CPU_FREQ_TABLE)		+= freq_table.o
 # K8 systems. ACPI is preferred to all other hardware-specific drivers.
 # speedstep-* is preferred over p4-clockmod.
 
-obj-$(CONFIG_X86_POWERNOW_K8)		+= powernow-k8.o mperf.o
+obj-$(CONFIG_X86_POWERNOW_K8)		+= powernow-k8.o
 obj-$(CONFIG_X86_ACPI_CPUFREQ)		+= acpi-cpufreq.o mperf.o
 obj-$(CONFIG_X86_PCC_CPUFREQ)		+= pcc-cpufreq.o
 obj-$(CONFIG_X86_POWERNOW_K6)		+= powernow-k6.o
diff --git a/drivers/cpufreq/powernow-k8.c b/drivers/cpufreq/powernow-k8.c
index 8ff0621..c8bfeff 100644
--- a/drivers/cpufreq/powernow-k8.c
+++ b/drivers/cpufreq/powernow-k8.c
@@ -49,22 +49,12 @@
 #define PFX "powernow-k8: "
 #define VERSION "version 2.20.00"
 #include "powernow-k8.h"
-#include "mperf.h"
 
 /* serialize freq changes  */
 static DEFINE_MUTEX(fidvid_mutex);
 
 static DEFINE_PER_CPU(struct powernow_k8_data *, powernow_data);
 
-static int cpu_family = CPU_OPTERON;
-
-/* array to map SW pstate number to acpi state */
-static u32 ps_to_as[8];
-
-/* core performance boost */
-static bool cpb_capable, cpb_enabled;
-static struct msr __percpu *msrs;
-
 static struct cpufreq_driver cpufreq_amd64_driver;
 
 #ifndef CONFIG_SMP
@@ -86,12 +76,6 @@ static u32 find_khz_freq_from_fid(u32 fid)
 	return 1000 * find_freq_from_fid(fid);
 }
 
-static u32 find_khz_freq_from_pstate(struct cpufreq_frequency_table *data,
-				     u32 pstate)
-{
-	return data[ps_to_as[pstate]].frequency;
-}
-
 /* Return the vco fid for an input fid
  *
  * Each "low" fid has corresponding "high" fid, and you can get to "low" fids
@@ -114,9 +98,6 @@ static int pending_bit_stuck(void)
 {
 	u32 lo, hi;
 
-	if (cpu_family == CPU_HW_PSTATE)
-		return 0;
-
 	rdmsr(MSR_FIDVID_STATUS, lo, hi);
 	return lo & MSR_S_LO_CHANGE_PENDING ? 1 : 0;
 }
@@ -130,20 +111,6 @@ static int query_current_values_with_pending_wait(struct powernow_k8_data *data)
 	u32 lo, hi;
 	u32 i = 0;
 
-	if (cpu_family == CPU_HW_PSTATE) {
-		rdmsr(MSR_PSTATE_STATUS, lo, hi);
-		i = lo & HW_PSTATE_MASK;
-		data->currpstate = i;
-
-		/*
-		 * a workaround for family 11h erratum 311 might cause
-		 * an "out-of-range Pstate if the core is in Pstate-0
-		 */
-		if ((boot_cpu_data.x86 == 0x11) && (i >= data->numps))
-			data->currpstate = HW_PSTATE_0;
-
-		return 0;
-	}
 	do {
 		if (i++ > 10000) {
 			pr_debug("detected change pending stuck\n");
@@ -300,14 +267,6 @@ static int decrease_vid_code_by_step(struct powernow_k8_data *data,
 	return 0;
 }
 
-/* Change hardware pstate by single MSR write */
-static int transition_pstate(struct powernow_k8_data *data, u32 pstate)
-{
-	wrmsr(MSR_PSTATE_CTRL, pstate, 0);
-	data->currpstate = pstate;
-	return 0;
-}
-
 /* Change Opteron/Athlon64 fid and vid, by the 3 phases. */
 static int transition_fid_vid(struct powernow_k8_data *data,
 		u32 reqfid, u32 reqvid)
@@ -524,8 +483,6 @@ static int core_voltage_post_transition(struct powernow_k8_data *data,
 static const struct x86_cpu_id powernow_k8_ids[] = {
 	/* IO based frequency switching */
 	{ X86_VENDOR_AMD, 0xf },
-	/* MSR based frequency switching supported */
-	X86_FEATURE_MATCH(X86_FEATURE_HW_PSTATE),
 	{}
 };
 MODULE_DEVICE_TABLE(x86cpu, powernow_k8_ids);
@@ -561,15 +518,8 @@ static void check_supported_cpu(void *_rc)
 				"Power state transitions not supported\n");
 			return;
 		}
-	} else { /* must be a HW Pstate capable processor */
-		cpuid(CPUID_FREQ_VOLT_CAPABILITIES, &eax, &ebx, &ecx, &edx);
-		if ((edx & USE_HW_PSTATE) == USE_HW_PSTATE)
-			cpu_family = CPU_HW_PSTATE;
-		else
-			return;
+		*rc = 0;
 	}
-
-	*rc = 0;
 }
 
 static int check_pst_table(struct powernow_k8_data *data, struct pst_s *pst,
@@ -633,18 +583,11 @@ static void print_basics(struct powernow_k8_data *data)
 	for (j = 0; j < data->numps; j++) {
 		if (data->powernow_table[j].frequency !=
 				CPUFREQ_ENTRY_INVALID) {
-			if (cpu_family == CPU_HW_PSTATE) {
-				printk(KERN_INFO PFX
-					"   %d : pstate %d (%d MHz)\n", j,
-					data->powernow_table[j].index,
-					data->powernow_table[j].frequency/1000);
-			} else {
 				printk(KERN_INFO PFX
 					"fid 0x%x (%d MHz), vid 0x%x\n",
 					data->powernow_table[j].index & 0xff,
 					data->powernow_table[j].frequency/1000,
 					data->powernow_table[j].index >> 8);
-			}
 		}
 	}
 	if (data->batps)
@@ -652,20 +595,6 @@ static void print_basics(struct powernow_k8_data *data)
 				data->batps);
 }
 
-static u32 freq_from_fid_did(u32 fid, u32 did)
-{
-	u32 mhz = 0;
-
-	if (boot_cpu_data.x86 == 0x10)
-		mhz = (100 * (fid + 0x10)) >> did;
-	else if (boot_cpu_data.x86 == 0x11)
-		mhz = (100 * (fid + 8)) >> did;
-	else
-		BUG();
-
-	return mhz * 1000;
-}
-
 static int fill_powernow_table(struct powernow_k8_data *data,
 		struct pst_s *pst, u8 maxvid)
 {
@@ -825,7 +754,7 @@ static void powernow_k8_acpi_pst_values(struct powernow_k8_data *data,
 {
 	u64 control;
 
-	if (!data->acpi_data.state_count || (cpu_family == CPU_HW_PSTATE))
+	if (!data->acpi_data.state_count)
 		return;
 
 	control = data->acpi_data.states[index].control;
@@ -876,10 +805,7 @@ static int powernow_k8_cpu_init_acpi(struct powernow_k8_data *data)
 	data->numps = data->acpi_data.state_count;
 	powernow_k8_acpi_pst_values(data, 0);
 
-	if (cpu_family == CPU_HW_PSTATE)
-		ret_val = fill_powernow_table_pstate(data, powernow_table);
-	else
-		ret_val = fill_powernow_table_fidvid(data, powernow_table);
+	ret_val = fill_powernow_table_fidvid(data, powernow_table);
 	if (ret_val)
 		goto err_out_mem;
 
@@ -916,51 +842,6 @@ err_out:
 	return ret_val;
 }
 
-static int fill_powernow_table_pstate(struct powernow_k8_data *data,
-		struct cpufreq_frequency_table *powernow_table)
-{
-	int i;
-	u32 hi = 0, lo = 0;
-	rdmsr(MSR_PSTATE_CUR_LIMIT, lo, hi);
-	data->max_hw_pstate = (lo & HW_PSTATE_MAX_MASK) >> HW_PSTATE_MAX_SHIFT;
-
-	for (i = 0; i < data->acpi_data.state_count; i++) {
-		u32 index;
-
-		index = data->acpi_data.states[i].control & HW_PSTATE_MASK;
-		if (index > data->max_hw_pstate) {
-			printk(KERN_ERR PFX "invalid pstate %d - "
-					"bad value %d.\n", i, index);
-			printk(KERN_ERR PFX "Please report to BIOS "
-					"manufacturer\n");
-			invalidate_entry(powernow_table, i);
-			continue;
-		}
-
-		ps_to_as[index] = i;
-
-		/* Frequency may be rounded for these */
-		if ((boot_cpu_data.x86 == 0x10 && boot_cpu_data.x86_model < 10)
-				 || boot_cpu_data.x86 == 0x11) {
-
-			rdmsr(MSR_PSTATE_DEF_BASE + index, lo, hi);
-			if (!(hi & HW_PSTATE_VALID_MASK)) {
-				pr_debug("invalid pstate %d, ignoring\n", index);
-				invalidate_entry(powernow_table, i);
-				continue;
-			}
-
-			powernow_table[i].frequency =
-				freq_from_fid_did(lo & 0x3f, (lo >> 6) & 7);
-		} else
-			powernow_table[i].frequency =
-				data->acpi_data.states[i].core_frequency * 1000;
-
-		powernow_table[i].index = index;
-	}
-	return 0;
-}
-
 static int fill_powernow_table_fidvid(struct powernow_k8_data *data,
 		struct cpufreq_frequency_table *powernow_table)
 {
@@ -1037,15 +918,7 @@ static int get_transition_latency(struct powernow_k8_data *data)
 			max_latency = cur_latency;
 	}
 	if (max_latency == 0) {
-		/*
-		 * Fam 11h and later may return 0 as transition latency. This
-		 * is intended and means "very fast". While cpufreq core and
-		 * governors currently can handle that gracefully, better set it
-		 * to 1 to avoid problems in the future.
-		 */
-		if (boot_cpu_data.x86 < 0x11)
-			printk(KERN_ERR FW_WARN PFX "Invalid zero transition "
-				"latency\n");
+		pr_err(FW_WARN PFX "Invalid zero transition latency\n");
 		max_latency = 1;
 	}
 	/* value in usecs, needs to be in nanoseconds */
@@ -1105,40 +978,6 @@ static int transition_frequency_fidvid(struct powernow_k8_data *data,
 	return res;
 }
 
-/* Take a frequency, and issue the hardware pstate transition command */
-static int transition_frequency_pstate(struct powernow_k8_data *data,
-		unsigned int index)
-{
-	u32 pstate = 0;
-	int res, i;
-	struct cpufreq_freqs freqs;
-
-	pr_debug("cpu %d transition to index %u\n", smp_processor_id(), index);
-
-	/* get MSR index for hardware pstate transition */
-	pstate = index & HW_PSTATE_MASK;
-	if (pstate > data->max_hw_pstate)
-		return -EINVAL;
-
-	freqs.old = find_khz_freq_from_pstate(data->powernow_table,
-			data->currpstate);
-	freqs.new = find_khz_freq_from_pstate(data->powernow_table, pstate);
-
-	for_each_cpu(i, data->available_cores) {
-		freqs.cpu = i;
-		cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE);
-	}
-
-	res = transition_pstate(data, pstate);
-	freqs.new = find_khz_freq_from_pstate(data->powernow_table, pstate);
-
-	for_each_cpu(i, data->available_cores) {
-		freqs.cpu = i;
-		cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
-	}
-	return res;
-}
-
 /* Driver entry point to switch to the target frequency */
 static int powernowk8_target(struct cpufreq_policy *pol,
 		unsigned targfreq, unsigned relation)
@@ -1180,18 +1019,15 @@ static int powernowk8_target(struct cpufreq_policy *pol,
 	if (query_current_values_with_pending_wait(data))
 		goto err_out;
 
-	if (cpu_family != CPU_HW_PSTATE) {
-		pr_debug("targ: curr fid 0x%x, vid 0x%x\n",
-		data->currfid, data->currvid);
+	pr_debug("targ: curr fid 0x%x, vid 0x%x\n",
+		 data->currfid, data->currvid);
 
-		if ((checkvid != data->currvid) ||
-		    (checkfid != data->currfid)) {
-			printk(KERN_INFO PFX
-				"error - out of sync, fix 0x%x 0x%x, "
-				"vid 0x%x 0x%x\n",
-				checkfid, data->currfid,
-				checkvid, data->currvid);
-		}
+	if ((checkvid != data->currvid) ||
+	    (checkfid != data->currfid)) {
+		pr_info(PFX
+		       "error - out of sync, fix 0x%x 0x%x, vid 0x%x 0x%x\n",
+		       checkfid, data->currfid,
+		       checkvid, data->currvid);
 	}
 
 	if (cpufreq_frequency_table_target(pol, data->powernow_table,
@@ -1202,11 +1038,8 @@ static int powernowk8_target(struct cpufreq_policy *pol,
 
 	powernow_k8_acpi_pst_values(data, newstate);
 
-	if (cpu_family == CPU_HW_PSTATE)
-		ret = transition_frequency_pstate(data,
-			data->powernow_table[newstate].index);
-	else
-		ret = transition_frequency_fidvid(data, newstate);
+	ret = transition_frequency_fidvid(data, newstate);
+
 	if (ret) {
 		printk(KERN_ERR PFX "transition frequency failed\n");
 		ret = 1;
@@ -1215,11 +1048,7 @@ static int powernowk8_target(struct cpufreq_policy *pol,
 	}
 	mutex_unlock(&fidvid_mutex);
 
-	if (cpu_family == CPU_HW_PSTATE)
-		pol->cur = find_khz_freq_from_pstate(data->powernow_table,
-				data->powernow_table[newstate].index);
-	else
-		pol->cur = find_khz_freq_from_fid(data->currfid);
+	pol->cur = find_khz_freq_from_fid(data->currfid);
 	ret = 0;
 
 err_out:
@@ -1259,8 +1088,7 @@ static void __cpuinit powernowk8_cpu_init_on_cpu(void *_init_on_cpu)
 		return;
 	}
 
-	if (cpu_family == CPU_OPTERON)
-		fidvid_msr_init();
+	fidvid_msr_init();
 
 	init_on_cpu->rc = 0;
 }
@@ -1274,7 +1102,6 @@ static int __cpuinit powernowk8_cpu_init(struct cpufreq_policy *pol)
 	struct powernow_k8_data *data;
 	struct init_on_cpu init_on_cpu;
 	int rc;
-	struct cpuinfo_x86 *c = &cpu_data(pol->cpu);
 
 	if (!cpu_online(pol->cpu))
 		return -ENODEV;
@@ -1290,7 +1117,6 @@ static int __cpuinit powernowk8_cpu_init(struct cpufreq_policy *pol)
 	}
 
 	data->cpu = pol->cpu;
-	data->currpstate = HW_PSTATE_INVALID;
 
 	if (powernow_k8_cpu_init_acpi(data)) {
 		/*
@@ -1327,17 +1153,10 @@ static int __cpuinit powernowk8_cpu_init(struct cpufreq_policy *pol)
 	if (rc != 0)
 		goto err_out_exit_acpi;
 
-	if (cpu_family == CPU_HW_PSTATE)
-		cpumask_copy(pol->cpus, cpumask_of(pol->cpu));
-	else
-		cpumask_copy(pol->cpus, cpu_core_mask(pol->cpu));
+	cpumask_copy(pol->cpus, cpu_core_mask(pol->cpu));
 	data->available_cores = pol->cpus;
 
-	if (cpu_family == CPU_HW_PSTATE)
-		pol->cur = find_khz_freq_from_pstate(data->powernow_table,
-				data->currpstate);
-	else
-		pol->cur = find_khz_freq_from_fid(data->currfid);
+	pol->cur = find_khz_freq_from_fid(data->currfid);
 	pr_debug("policy current frequency %d kHz\n", pol->cur);
 
 	/* min/max the cpu is capable of */
@@ -1349,18 +1168,10 @@ static int __cpuinit powernowk8_cpu_init(struct cpufreq_policy *pol)
 		return -EINVAL;
 	}
 
-	/* Check for APERF/MPERF support in hardware */
-	if (cpu_has(c, X86_FEATURE_APERFMPERF))
-		cpufreq_amd64_driver.getavg = cpufreq_get_measured_perf;
-
 	cpufreq_frequency_table_get_attr(data->powernow_table, pol->cpu);
 
-	if (cpu_family == CPU_HW_PSTATE)
-		pr_debug("cpu_init done, current pstate 0x%x\n",
-				data->currpstate);
-	else
-		pr_debug("cpu_init done, current fid 0x%x, vid 0x%x\n",
-			data->currfid, data->currvid);
+	pr_debug("cpu_init done, current fid 0x%x, vid 0x%x\n",
+		 data->currfid, data->currvid);
 
 	per_cpu(powernow_data, pol->cpu) = data;
 
@@ -1413,88 +1224,15 @@ static unsigned int powernowk8_get(unsigned int cpu)
 	if (err)
 		goto out;
 
-	if (cpu_family == CPU_HW_PSTATE)
-		khz = find_khz_freq_from_pstate(data->powernow_table,
-						data->currpstate);
-	else
-		khz = find_khz_freq_from_fid(data->currfid);
+	khz = find_khz_freq_from_fid(data->currfid);
 
 
 out:
 	return khz;
 }
 
-static void _cpb_toggle_msrs(bool t)
-{
-	int cpu;
-
-	get_online_cpus();
-
-	rdmsr_on_cpus(cpu_online_mask, MSR_K7_HWCR, msrs);
-
-	for_each_cpu(cpu, cpu_online_mask) {
-		struct msr *reg = per_cpu_ptr(msrs, cpu);
-		if (t)
-			reg->l &= ~BIT(25);
-		else
-			reg->l |= BIT(25);
-	}
-	wrmsr_on_cpus(cpu_online_mask, MSR_K7_HWCR, msrs);
-
-	put_online_cpus();
-}
-
-/*
- * Switch on/off core performance boosting.
- *
- * 0=disable
- * 1=enable.
- */
-static void cpb_toggle(bool t)
-{
-	if (!cpb_capable)
-		return;
-
-	if (t && !cpb_enabled) {
-		cpb_enabled = true;
-		_cpb_toggle_msrs(t);
-		printk(KERN_INFO PFX "Core Boosting enabled.\n");
-	} else if (!t && cpb_enabled) {
-		cpb_enabled = false;
-		_cpb_toggle_msrs(t);
-		printk(KERN_INFO PFX "Core Boosting disabled.\n");
-	}
-}
-
-static ssize_t store_cpb(struct cpufreq_policy *policy, const char *buf,
-				 size_t count)
-{
-	int ret = -EINVAL;
-	unsigned long val = 0;
-
-	ret = strict_strtoul(buf, 10, &val);
-	if (!ret && (val == 0 || val == 1) && cpb_capable)
-		cpb_toggle(val);
-	else
-		return -EINVAL;
-
-	return count;
-}
-
-static ssize_t show_cpb(struct cpufreq_policy *policy, char *buf)
-{
-	return sprintf(buf, "%u\n", cpb_enabled);
-}
-
-#define define_one_rw(_name) \
-static struct freq_attr _name = \
-__ATTR(_name, 0644, show_##_name, store_##_name)
-
-define_one_rw(cpb);
-
 static struct freq_attr *powernow_k8_attr[] = {
 	&cpufreq_freq_attr_scaling_available_freqs,
-	&cpb,
 	NULL,
 };
 
@@ -1510,58 +1248,20 @@ static struct cpufreq_driver cpufreq_amd64_driver = {
 	.attr		= powernow_k8_attr,
 };
 
-/*
- * Clear the boost-disable flag on the CPU_DOWN path so that this cpu
- * cannot block the remaining ones from boosting. On the CPU_UP path we
- * simply keep the boost-disable flag in sync with the current global
- * state.
- */
-static int cpb_notify(struct notifier_block *nb, unsigned long action,
-		      void *hcpu)
-{
-	unsigned cpu = (long)hcpu;
-	u32 lo, hi;
-
-	switch (action) {
-	case CPU_UP_PREPARE:
-	case CPU_UP_PREPARE_FROZEN:
-
-		if (!cpb_enabled) {
-			rdmsr_on_cpu(cpu, MSR_K7_HWCR, &lo, &hi);
-			lo |= BIT(25);
-			wrmsr_on_cpu(cpu, MSR_K7_HWCR, lo, hi);
-		}
-		break;
-
-	case CPU_DOWN_PREPARE:
-	case CPU_DOWN_PREPARE_FROZEN:
-		rdmsr_on_cpu(cpu, MSR_K7_HWCR, &lo, &hi);
-		lo &= ~BIT(25);
-		wrmsr_on_cpu(cpu, MSR_K7_HWCR, lo, hi);
-		break;
-
-	default:
-		break;
-	}
-
-	return NOTIFY_OK;
-}
-
-static struct notifier_block cpb_nb = {
-	.notifier_call		= cpb_notify,
-};
-
 /* driver entry point for init */
 static int __cpuinit powernowk8_init(void)
 {
-	unsigned int i, supported_cpus = 0, cpu;
+	unsigned int i, supported_cpus = 0;
 	int rv;
 
-	if (!x86_match_cpu(powernow_k8_ids))
+	if (static_cpu_has(X86_FEATURE_HW_PSTATE)) {
+		pr_warn(PFX "this CPU is not supported anymore, using acpi-cpufreq instead.\n");
+		request_module("acpi-cpufreq");
 		return -ENODEV;
+	}
 
-	if (static_cpu_has(X86_FEATURE_HW_PSTATE))
-		pr_warn(PFX "support for this CPU is deprecated, use acpi-cpufreq instead.\n");
+	if (!x86_match_cpu(powernow_k8_ids))
+		return -ENODEV;
 
 	for_each_online_cpu(i) {
 		int rc;
@@ -1573,26 +1273,6 @@ static int __cpuinit powernowk8_init(void)
 	if (supported_cpus != num_online_cpus())
 		return -ENODEV;
 
-	if (boot_cpu_has(X86_FEATURE_CPB)) {
-
-		cpb_capable = true;
-
-		msrs = msrs_alloc();
-		if (!msrs) {
-			printk(KERN_ERR "%s: Error allocating msrs!\n", __func__);
-			return -ENOMEM;
-		}
-
-		register_cpu_notifier(&cpb_nb);
-
-		rdmsr_on_cpus(cpu_online_mask, MSR_K7_HWCR, msrs);
-
-		for_each_cpu(cpu, cpu_online_mask) {
-			struct msr *reg = per_cpu_ptr(msrs, cpu);
-			cpb_enabled |= !(!!(reg->l & BIT(25)));
-		}
-	}
-
 	rv = cpufreq_register_driver(&cpufreq_amd64_driver);
 
 	if (!rv)
@@ -1600,15 +1280,6 @@ static int __cpuinit powernowk8_init(void)
 			num_online_nodes(), boot_cpu_data.x86_model_id,
 			supported_cpus);
 
-	if (boot_cpu_has(X86_FEATURE_CPB)) {
-		if (rv < 0) {
-			unregister_cpu_notifier(&cpb_nb);
-			msrs_free(msrs);
-			msrs = NULL;
-		} else
-			pr_info(PFX "Core Performance Boosting: %s.\n",
-				(cpb_enabled ? "on" : "off"));
-	}
 	return rv;
 }
 
@@ -1617,13 +1288,6 @@ static void __exit powernowk8_exit(void)
 {
 	pr_debug("exit\n");
 
-	if (boot_cpu_has(X86_FEATURE_CPB)) {
-		msrs_free(msrs);
-		msrs = NULL;
-
-		unregister_cpu_notifier(&cpb_nb);
-	}
-
 	cpufreq_unregister_driver(&cpufreq_amd64_driver);
 }
 
diff --git a/drivers/cpufreq/powernow-k8.h b/drivers/cpufreq/powernow-k8.h
index 3744d26..79329d4 100644
--- a/drivers/cpufreq/powernow-k8.h
+++ b/drivers/cpufreq/powernow-k8.h
@@ -5,24 +5,11 @@
  *  http://www.gnu.org/licenses/gpl.html
  */
 
-enum pstate {
-	HW_PSTATE_INVALID = 0xff,
-	HW_PSTATE_0 = 0,
-	HW_PSTATE_1 = 1,
-	HW_PSTATE_2 = 2,
-	HW_PSTATE_3 = 3,
-	HW_PSTATE_4 = 4,
-	HW_PSTATE_5 = 5,
-	HW_PSTATE_6 = 6,
-	HW_PSTATE_7 = 7,
-};
-
 struct powernow_k8_data {
 	unsigned int cpu;
 
 	u32 numps;  /* number of p-states */
 	u32 batps;  /* number of p-states supported on battery */
-	u32 max_hw_pstate; /* maximum legal hardware pstate */
 
 	/* these values are constant when the PSB is used to determine
 	 * vid/fid pairings, but are modified during the ->target() call
@@ -37,7 +24,6 @@ struct powernow_k8_data {
 	/* keep track of the current fid / vid or pstate */
 	u32 currvid;
 	u32 currfid;
-	enum pstate currpstate;
 
 	/* the powernow_table includes all frequency and vid/fid pairings:
 	 * fid are the lower 8 bits of the index, vid are the upper 8 bits.
@@ -97,23 +83,6 @@ struct powernow_k8_data {
 #define MSR_S_HI_CURRENT_VID      0x0000003f
 #define MSR_C_HI_STP_GNT_BENIGN	  0x00000001
 
-
-/* Hardware Pstate _PSS and MSR definitions */
-#define USE_HW_PSTATE		0x00000080
-#define HW_PSTATE_MASK 		0x00000007
-#define HW_PSTATE_VALID_MASK 	0x80000000
-#define HW_PSTATE_MAX_MASK	0x000000f0
-#define HW_PSTATE_MAX_SHIFT	4
-#define MSR_PSTATE_DEF_BASE 	0xc0010064 /* base of Pstate MSRs */
-#define MSR_PSTATE_STATUS 	0xc0010063 /* Pstate Status MSR */
-#define MSR_PSTATE_CTRL 	0xc0010062 /* Pstate control MSR */
-#define MSR_PSTATE_CUR_LIMIT	0xc0010061 /* pstate current limit MSR */
-
-/* define the two driver architectures */
-#define CPU_OPTERON 0
-#define CPU_HW_PSTATE 1
-
-
 /*
  * There are restrictions frequencies have to follow:
  * - only 1 entry in the low fid table ( <=1.4GHz )
@@ -218,5 +187,4 @@ static int core_frequency_transition(struct powernow_k8_data *data, u32 reqfid);
 
 static void powernow_k8_acpi_pst_values(struct powernow_k8_data *data, unsigned int index);
 
-static int fill_powernow_table_pstate(struct powernow_k8_data *data, struct cpufreq_frequency_table *powernow_table);
 static int fill_powernow_table_fidvid(struct powernow_k8_data *data, struct cpufreq_frequency_table *powernow_table);
-- 
1.7.12



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/8 v2] acpi-cpufreq: Move modern AMD cpufreq support to acpi-cpufreq
  2012-09-04  8:28 [PATCH 0/8 v2] acpi-cpufreq: Move modern AMD cpufreq support to acpi-cpufreq Andre Przywara
                   ` (7 preceding siblings ...)
  2012-09-04  8:28 ` [PATCH 8/8 v2] cpufreq: Remove support for hardware P-state chips from powernow-k8 Andre Przywara
@ 2012-09-05 13:46 ` Rafael J. Wysocki
  2012-09-05 14:25   ` Thomas Renninger
  8 siblings, 1 reply; 14+ messages in thread
From: Rafael J. Wysocki @ 2012-09-05 13:46 UTC (permalink / raw)
  To: Andre Przywara
  Cc: Thomas Renninger, Matthew Garret, linux-pm, cpufreq,
	linux-kernel, Andreas Herrmann

On Tuesday, September 04, 2012, Andre Przywara wrote:
> Hi,
> 
> now the second, revised version of the patch set. I now tested loading
> both drivers after each other in several combinations, after two bug
> fixes this now works as expected.
> I added a patch to move messages from powernow-k8 after the initialization
> phase, so it remains silent if driver loading fails.
> 
> I also rearranged the patches so that the powernow-k8 feature removal patch is
> now the last one and is somewhat optional. I still prefer to have it in,
> but it can be dropped if needed. Then powernow-k8 will still support modern
> AMD CPUs, but will emit a warning message.
> 
> Regards,
> Andre
> 
> Changes from v1:
> * added hints to Kconfig about CPU support
> * merge documentation from separate patch into the feature patch
> * add deprecation warning
> * prefix acpi-cpufreq warning messages with module name
> * bugfix: avoid boost init it driver registration failed
> * bugfix: fix module redirect request call (was not reached before)

I have applied the whole series to the linux-next branch of the linux-pm.git
tree, but I'm quite unsure about [7/8].  Is it really necessary?  I mean, since
we want users to switch to a different driver anyway, which already has a knob
providing the functionality in question, what's the point really?

Rafael

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/8 v2] acpi-cpufreq: Move modern AMD cpufreq support to acpi-cpufreq
  2012-09-05 13:46 ` [PATCH 0/8 v2] acpi-cpufreq: Move modern AMD cpufreq support to acpi-cpufreq Rafael J. Wysocki
@ 2012-09-05 14:25   ` Thomas Renninger
  2012-09-05 15:05     ` Andre Przywara
  0 siblings, 1 reply; 14+ messages in thread
From: Thomas Renninger @ 2012-09-05 14:25 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Andre Przywara, Matthew Garret, linux-pm, cpufreq, linux-kernel,
	Andreas Herrmann

On Wednesday, September 05, 2012 03:46:22 PM Rafael J. Wysocki wrote:
> On Tuesday, September 04, 2012, Andre Przywara wrote:
> > Hi,
> > 
> > now the second, revised version of the patch set. I now tested loading
> > both drivers after each other in several combinations, after two bug
> > fixes this now works as expected.
> > I added a patch to move messages from powernow-k8 after the initialization
> > phase, so it remains silent if driver loading fails.
> > 
> > I also rearranged the patches so that the powernow-k8 feature removal patch is
> > now the last one and is somewhat optional. I still prefer to have it in,
> > but it can be dropped if needed. Then powernow-k8 will still support modern
> > AMD CPUs, but will emit a warning message.
> > 
> > Regards,
> > Andre
> > 
> > Changes from v1:
> > * added hints to Kconfig about CPU support
> > * merge documentation from separate patch into the feature patch
> > * add deprecation warning
> > * prefix acpi-cpufreq warning messages with module name
> > * bugfix: avoid boost init it driver registration failed
> > * bugfix: fix module redirect request call (was not reached before)
> 
> I have applied the whole series to the linux-next branch of the linux-pm.git
> tree, but I'm quite unsure about [7/8].  Is it really necessary?  I mean, since
> we want users to switch to a different driver anyway, which already has a knob
> providing the functionality in question, what's the point really?

Afaik this was mostly for compiler people or other developers doing
benchmarks to be able to avoid fluctating numbers caused by turbo/boost
mode.

I don't know whether it's used by any userspace tool. If, it would certainly
be an AMD specific tool for developers. If you (AMD) know a userspace tool
out there it may make sense to keep it the one or other kernel round,
otherwise I guess it can simply be deleted.

I will add the functionality (to enable/disable turbo/boost mode)
to cpupower userspace tool as soon as this is in.
Shouldn't be much more than a -t option and a
one liner to read out or set another cpufreq sysfs file.

   Thomas

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/8 v2] acpi-cpufreq: Move modern AMD cpufreq support to acpi-cpufreq
  2012-09-05 14:25   ` Thomas Renninger
@ 2012-09-05 15:05     ` Andre Przywara
  0 siblings, 0 replies; 14+ messages in thread
From: Andre Przywara @ 2012-09-05 15:05 UTC (permalink / raw)
  To: Thomas Renninger
  Cc: Rafael J. Wysocki, Matthew Garret, linux-pm, cpufreq,
	linux-kernel, Andreas Herrmann

On 09/05/2012 04:25 PM, Thomas Renninger wrote:
> On Wednesday, September 05, 2012 03:46:22 PM Rafael J. Wysocki wrote:
>> On Tuesday, September 04, 2012, Andre Przywara wrote:
>>> Hi,
>>>
>>
>> I have applied the whole series to the linux-next branch of the linux-pm.git

Thanks!

>> tree, but I'm quite unsure about [7/8].  Is it really necessary?  I mean, since
>> we want users to switch to a different driver anyway, which already has a knob
>> providing the functionality in question, what's the point really?
>
> Afaik this was mostly for compiler people or other developers doing
> benchmarks to be able to avoid fluctating numbers caused by turbo/boost
> mode.
>
> I don't know whether it's used by any userspace tool.

Well, at least we use it here in quite some scripts, both for testing 
and for benchmarking.
As this has been in for 2.5 year by now, I'd like to keep it for one or 
two more major releases and then remove it.

> If, it would certainly
> be an AMD specific tool for developers. If you (AMD) know a userspace tool
> out there it may make sense to keep it the one or other kernel round,
> otherwise I guess it can simply be deleted.
>
> I will add the functionality (to enable/disable turbo/boost mode)
> to cpupower userspace tool as soon as this is in.
> Shouldn't be much more than a -t option and a
> one liner to read out or set another cpufreq sysfs file.

Thanks, that would be nice.

Regards,
Andre.

-- 
Andre Przywara
AMD-Operating System Research Center (OSRC), Dresden, Germany


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/8 v2] acpi-cpufreq: Add quirk to disable _PSD usage on all AMD CPUs
       [not found]   ` <CACJDEmrOoZ_azqTrtfoGC-O=H0V=AHy8VExoFYYPhbUUx+7UAg@mail.gmail.com>
@ 2012-09-17  7:41     ` Andre Przywara
  2012-09-17 11:40       ` Thomas Renninger
  0 siblings, 1 reply; 14+ messages in thread
From: Andre Przywara @ 2012-09-17  7:41 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Andreas Herrmann, Thomas Renninger, Rafael J. Wysocki, cpufreq,
	linux-pm, linux-kernel, Matthew Garret

On 09/15/2012 01:20 PM, Konrad Rzeszutek Wilk wrote:
>
> On Sep 4, 2012 4:26 AM, "Andre Przywara" <andre.przywara@amd.com
> <mailto:andre.przywara@amd.com>> wrote:
>  >
>  > To workaround some Windows specific behavior, the ACPI _PSD table
>  > on AMD desktop boards advertises all cores as dependent, meaning
>  > that they all can only use the same P-state. acpi-cpufreq strictly
>  > obeys this description, instantiating one CPU only and symlinking
>  > the others. But the hardware can have distinct frequencies for each
>  > core and powernow-k8 did it that way.
>  > So, in order to use the hardware to its full potential and keep the
>  > original powernow-k8 behavior, lets override the _PSD table setting
>  > on AMD hardware.
>
> Can you tell on which motherboards this different freq on cores can be
> exposed?

Actually I think the motherboard is not the limiting factor here.
Per-core P-states are possible on AMD Family 10h and greater CPUs, so 
Phenoms/Barcelona and beyond. You may need an AM2+/AM3 board, though. 
Fam10h CPUs run in some older AM2-only boards, but the power-management 
capabilities may be limited here. This definitely applies to the voltage 
(single power plane only), but the older BIOS on these boards may limit 
the frequency switching also.

The actual reason for this patch is the (Fam10h) BKDG [1] recommendation 
in chapter 2.4.2.13.4 _PSD (P-State Dependency). There advertising only 
one possible P-state per package is recommended - for desktop CPUs 
(which are supposed to run Windows ;-)
This was to overcome some nasty interaction between the Windows 
scheduler and their version of the ondemand governor.

Regards,
Andre.

[1] http://support.amd.com/us/Processor_TechDocs/31116.pdf

-- 
Andre Przywara
AMD-Operating System Research Center (OSRC), Dresden, Germany


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/8 v2] acpi-cpufreq: Add quirk to disable _PSD usage on all AMD CPUs
  2012-09-17  7:41     ` Andre Przywara
@ 2012-09-17 11:40       ` Thomas Renninger
  0 siblings, 0 replies; 14+ messages in thread
From: Thomas Renninger @ 2012-09-17 11:40 UTC (permalink / raw)
  To: Andre Przywara
  Cc: Konrad Rzeszutek Wilk, Andreas Herrmann, Rafael J. Wysocki,
	cpufreq, linux-pm, linux-kernel, Matthew Garret

On Monday, September 17, 2012 09:41:20 AM Andre Przywara wrote:
> On 09/15/2012 01:20 PM, Konrad Rzeszutek Wilk wrote:
> >
...
> This was to overcome some nasty interaction between the Windows 
> scheduler and their version of the ondemand governor.
Whoever is/was responsible for this, can you explain him/her that
this was a bad idea and why.

Is this part of a BKDG?
Can you point to a public spec and the exact wording of the
"Windows scheduler workaround" BIOS vendors shall do?

> +               pr_info_once(PFX "overriding BIOS provided _PSD data\n");
The message shows up on nearly every platform wether a _PSD
function exists or not. This is wrong.

If it's _PSD info that should get ignored/overwritten, this should
be done where _PSD is obtained:
processor_perflib.c

Are you sure that it will never make sense for AMD to make use of
_PSD tables?
If yes, then always ignoring might be an option.

If not, this might need a more specific check, e.g.:
   - Latest Windows version support called via OSI interface?
        Latest Windowses should/may not need this anymore?
   - Check for Desktop CPUs that are affected by the bad spec?

Hm, as powernow-k8 never made use of _PSD, ignoring it for
now sounds like a good thing to do. Still the ignoring should get
moved to processor_perflib.c, best with a pointer or at least
a comment that _PSD can be dangerous on AMD platforms. At some
day _PSD may make sense for AMD platforms as well?


    Thomas

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2012-09-17 11:40 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-09-04  8:28 [PATCH 0/8 v2] acpi-cpufreq: Move modern AMD cpufreq support to acpi-cpufreq Andre Przywara
2012-09-04  8:28 ` [PATCH 1/8 v2] acpi-cpufreq: Add support for modern AMD CPUs Andre Przywara
2012-09-04  8:28 ` [PATCH 2/8 v2] acpi-cpufreq: Add quirk to disable _PSD usage on all " Andre Przywara
     [not found]   ` <CACJDEmrOoZ_azqTrtfoGC-O=H0V=AHy8VExoFYYPhbUUx+7UAg@mail.gmail.com>
2012-09-17  7:41     ` Andre Przywara
2012-09-17 11:40       ` Thomas Renninger
2012-09-04  8:28 ` [PATCH 3/8 v2] cpufreq: Add warning message to powernow-k8 Andre Przywara
2012-09-04  8:28 ` [PATCH 4/8 v2] powernow-k8: delay info messages until initialization has succeeded Andre Przywara
2012-09-04  8:28 ` [PATCH 5/8 v2] ACPI: Add fixups for AMD P-state figures Andre Przywara
2012-09-04  8:28 ` [PATCH 6/8 v2] acpi-cpufreq: Add support for disabling dynamic overclocking Andre Przywara
2012-09-04  8:28 ` [PATCH 7/8 v2] acpi-cpufreq: Add compatibility for legacy AMD cpb sysfs knob Andre Przywara
2012-09-04  8:28 ` [PATCH 8/8 v2] cpufreq: Remove support for hardware P-state chips from powernow-k8 Andre Przywara
2012-09-05 13:46 ` [PATCH 0/8 v2] acpi-cpufreq: Move modern AMD cpufreq support to acpi-cpufreq Rafael J. Wysocki
2012-09-05 14:25   ` Thomas Renninger
2012-09-05 15:05     ` Andre Przywara

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).