* [RFC PATCH 0/3] x86/cpufeature: Cleanup stuff
@ 2015-11-10 11:48 Borislav Petkov
  2015-11-10 11:48 ` [RFC PATCH 1/3] x86/cpufeature: Move some of the scattered feature bits to x86_capability Borislav Petkov
                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Borislav Petkov @ 2015-11-10 11:48 UTC (permalink / raw)
  To: X86 ML; +Cc: LKML, Peter Zijlstra, Andy Lutomirski

From: Borislav Petkov <bp@suse.de>

Hi all,

so this should take care of cleaning up some aspects of our cpufeatures
handling.

Patches should be pretty self-explanatory but let me send them out as
an RFC - I might've missed something obvious of the sort "but but, you
can't do that..."

Thanks.

Borislav Petkov (3):
  x86/cpufeature: Move some of the scattered feature bits to
    x86_capability
  x86/cpufeature: Cleanup get_cpu_cap()
  x86/cpufeature: Remove unused and seldomly used cpu_has_xx macros

 arch/x86/crypto/chacha20_glue.c             |   2 +-
 arch/x86/crypto/crc32c-intel_glue.c         |   3 +-
 arch/x86/include/asm/cmpxchg_32.h           |   2 +-
 arch/x86/include/asm/cpufeature.h           | 106 +++++++++++++++-------------
 arch/x86/include/asm/smp.h                  |   2 +-
 arch/x86/kernel/cpu/amd.c                   |   2 +-
 arch/x86/kernel/cpu/centaur.c               |   2 +-
 arch/x86/kernel/cpu/common.c                |  48 +++++++------
 arch/x86/kernel/cpu/intel.c                 |   3 +-
 arch/x86/kernel/cpu/mtrr/generic.c          |   2 +-
 arch/x86/kernel/cpu/mtrr/main.c             |   2 +-
 arch/x86/kernel/cpu/perf_event_amd.c        |   4 +-
 arch/x86/kernel/cpu/perf_event_amd_uncore.c |   8 +--
 arch/x86/kernel/cpu/scattered.c             |  20 ------
 arch/x86/kernel/cpu/transmeta.c             |   4 +-
 arch/x86/kernel/fpu/init.c                  |   4 +-
 arch/x86/kernel/hw_breakpoint.c             |   3 +-
 arch/x86/kernel/vm86_32.c                   |   4 +-
 arch/x86/mm/setup_nx.c                      |   4 +-
 drivers/char/hw_random/via-rng.c            |   5 +-
 drivers/crypto/padlock-aes.c                |   2 +-
 drivers/crypto/padlock-sha.c                |   3 +-
 fs/btrfs/disk-io.c                          |   2 +-
 23 files changed, 115 insertions(+), 122 deletions(-)

-- 
2.3.5


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [RFC PATCH 1/3] x86/cpufeature: Move some of the scattered feature bits to x86_capability
  2015-11-10 11:48 [RFC PATCH 0/3] x86/cpufeature: Cleanup stuff Borislav Petkov
@ 2015-11-10 11:48 ` Borislav Petkov
  2015-11-10 11:48 ` [RFC PATCH 2/3] x86/cpufeature: Cleanup get_cpu_cap() Borislav Petkov
  2015-11-10 11:48 ` [RFC PATCH 3/3] x86/cpufeature: Remove unused and seldomly used cpu_has_xx macros Borislav Petkov
  2 siblings, 0 replies; 16+ messages in thread
From: Borislav Petkov @ 2015-11-10 11:48 UTC (permalink / raw)
  To: X86 ML; +Cc: LKML, Peter Zijlstra, Andy Lutomirski

From: Borislav Petkov <bp@suse.de>

Turn the CPUID leafs which are proper feature bit leafs into separate
->x86_capability words.

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/include/asm/cpufeature.h | 54 +++++++++++++++++++++++----------------
 arch/x86/kernel/cpu/common.c      |  5 ++++
 arch/x86/kernel/cpu/scattered.c   | 20 ---------------
 3 files changed, 37 insertions(+), 42 deletions(-)

diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index e4f8010f22e0..13d78e0e6ae0 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -12,7 +12,7 @@
 #include <asm/disabled-features.h>
 #endif
 
-#define NCAPINTS	14	/* N 32-bit words worth of info */
+#define NCAPINTS	16	/* N 32-bit words worth of info */
 #define NBUGINTS	1	/* N 32-bit bug flags */
 
 /*
@@ -181,22 +181,17 @@
 
 /*
  * Auxiliary flags: Linux defined - For features scattered in various
- * CPUID levels like 0x6, 0xA etc, word 7
+ * CPUID levels like 0x6, 0xA etc, word 7.
+ *
+ * Reuse free bits when adding new feature flags!
  */
-#define X86_FEATURE_IDA		( 7*32+ 0) /* Intel Dynamic Acceleration */
-#define X86_FEATURE_ARAT	( 7*32+ 1) /* Always Running APIC Timer */
+
 #define X86_FEATURE_CPB		( 7*32+ 2) /* AMD Core Performance Boost */
 #define X86_FEATURE_EPB		( 7*32+ 3) /* IA32_ENERGY_PERF_BIAS support */
-#define X86_FEATURE_PLN		( 7*32+ 5) /* Intel Power Limit Notification */
-#define X86_FEATURE_PTS		( 7*32+ 6) /* Intel Package Thermal Status */
-#define X86_FEATURE_DTHERM	( 7*32+ 7) /* Digital Thermal Sensor */
+
 #define X86_FEATURE_HW_PSTATE	( 7*32+ 8) /* AMD HW-PState */
 #define X86_FEATURE_PROC_FEEDBACK ( 7*32+ 9) /* AMD ProcFeedbackInterface */
-#define X86_FEATURE_HWP		( 7*32+ 10) /* "hwp" Intel HWP */
-#define X86_FEATURE_HWP_NOTIFY	( 7*32+ 11) /* Intel HWP_NOTIFY */
-#define X86_FEATURE_HWP_ACT_WINDOW ( 7*32+ 12) /* Intel HWP_ACT_WINDOW */
-#define X86_FEATURE_HWP_EPP	( 7*32+13) /* Intel HWP_EPP */
-#define X86_FEATURE_HWP_PKG_REQ ( 7*32+14) /* Intel HWP_PKG_REQ */
+
 #define X86_FEATURE_INTEL_PT	( 7*32+15) /* Intel Processor Trace */
 
 /* Virtualization flags: Linux defined, word 8 */
@@ -205,16 +200,7 @@
 #define X86_FEATURE_FLEXPRIORITY ( 8*32+ 2) /* Intel FlexPriority */
 #define X86_FEATURE_EPT         ( 8*32+ 3) /* Intel Extended Page Table */
 #define X86_FEATURE_VPID        ( 8*32+ 4) /* Intel Virtual Processor ID */
-#define X86_FEATURE_NPT		( 8*32+ 5) /* AMD Nested Page Table support */
-#define X86_FEATURE_LBRV	( 8*32+ 6) /* AMD LBR Virtualization support */
-#define X86_FEATURE_SVML	( 8*32+ 7) /* "svm_lock" AMD SVM locking MSR */
-#define X86_FEATURE_NRIPS	( 8*32+ 8) /* "nrip_save" AMD SVM next_rip save */
-#define X86_FEATURE_TSCRATEMSR  ( 8*32+ 9) /* "tsc_scale" AMD TSC scaling support */
-#define X86_FEATURE_VMCBCLEAN   ( 8*32+10) /* "vmcb_clean" AMD VMCB clean bits support */
-#define X86_FEATURE_FLUSHBYASID ( 8*32+11) /* AMD flush-by-ASID support */
-#define X86_FEATURE_DECODEASSISTS ( 8*32+12) /* AMD Decode Assists support */
-#define X86_FEATURE_PAUSEFILTER ( 8*32+13) /* AMD filtered pause intercept */
-#define X86_FEATURE_PFTHRESHOLD ( 8*32+14) /* AMD pause filter threshold */
+
 #define X86_FEATURE_VMMCALL     ( 8*32+15) /* Prefer vmmcall to vmcall */
 
 
@@ -258,6 +244,30 @@
 /* AMD-defined CPU features, CPUID level 0x80000008 (ebx), word 13 */
 #define X86_FEATURE_CLZERO	(13*32+0) /* CLZERO instruction */
 
+/* Thermal and Power Management Leaf, CPUID level 0x00000006 (eax), word 14 */
+#define X86_FEATURE_DTHERM	(14*32+ 0) /* Digital Thermal Sensor */
+#define X86_FEATURE_IDA		(14*32+ 1) /* Intel Dynamic Acceleration */
+#define X86_FEATURE_ARAT	(14*32+ 2) /* Always Running APIC Timer */
+#define X86_FEATURE_PLN		(14*32+ 4) /* Intel Power Limit Notification */
+#define X86_FEATURE_PTS		(14*32+ 6) /* Intel Package Thermal Status */
+#define X86_FEATURE_HWP		(14*32+ 7) /* Intel Hardware P-states */
+#define X86_FEATURE_HWP_NOTIFY	(14*32+ 8) /* HWP Notification */
+#define X86_FEATURE_HWP_ACT_WINDOW (14*32+ 9) /* HWP Activity Window */
+#define X86_FEATURE_HWP_EPP	(14*32+10) /* HWP Energy Perf. Preference */
+#define X86_FEATURE_HWP_PKG_REQ (14*32+11) /* HWP Package Level Request */
+
+/* AMD SVM Feature Identification, CPUID level 0x8000000a (edx), word 15 */
+#define X86_FEATURE_NPT		(15*32+ 0) /* Nested Page Table support */
+#define X86_FEATURE_LBRV	(15*32+ 1) /* LBR Virtualization support */
+#define X86_FEATURE_SVML	(15*32+ 2) /* "svm_lock" SVM locking MSR */
+#define X86_FEATURE_NRIPS	(15*32+ 3) /* "nrip_save" SVM next_rip save */
+#define X86_FEATURE_TSCRATEMSR  (15*32+ 4) /* "tsc_scale" TSC scaling support */
+#define X86_FEATURE_VMCBCLEAN   (15*32+ 5) /* "vmcb_clean" VMCB clean bits support */
+#define X86_FEATURE_FLUSHBYASID (15*32+ 6) /* flush-by-ASID support */
+#define X86_FEATURE_DECODEASSISTS (15*32+ 7) /* Decode Assists support */
+#define X86_FEATURE_PAUSEFILTER (15*32+10) /* filtered pause intercept */
+#define X86_FEATURE_PFTHRESHOLD (15*32+12) /* pause filter threshold */
+
 /*
  * BUG word(s)
  */
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 4ddd780aeac9..3e7ed149082b 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -619,6 +619,8 @@ void get_cpu_cap(struct cpuinfo_x86 *c)
 		cpuid_count(0x00000007, 0, &eax, &ebx, &ecx, &edx);
 
 		c->x86_capability[9] = ebx;
+
+		c->x86_capability[14] = cpuid_eax(0x00000006);
 	}
 
 	/* Extended state features: level 0x0000000d */
@@ -680,6 +682,9 @@ void get_cpu_cap(struct cpuinfo_x86 *c)
 	if (c->extended_cpuid_level >= 0x80000007)
 		c->x86_power = cpuid_edx(0x80000007);
 
+	if (c->extended_cpuid_level >= 0x8000000a)
+		c->x86_capability[15] = cpuid_edx(0x8000000a);
+
 	init_scattered_cpuid_features(c);
 }
 
diff --git a/arch/x86/kernel/cpu/scattered.c b/arch/x86/kernel/cpu/scattered.c
index 608fb26c7254..8cb57df9398d 100644
--- a/arch/x86/kernel/cpu/scattered.c
+++ b/arch/x86/kernel/cpu/scattered.c
@@ -31,32 +31,12 @@ void init_scattered_cpuid_features(struct cpuinfo_x86 *c)
 	const struct cpuid_bit *cb;
 
 	static const struct cpuid_bit cpuid_bits[] = {
-		{ X86_FEATURE_DTHERM,		CR_EAX, 0, 0x00000006, 0 },
-		{ X86_FEATURE_IDA,		CR_EAX, 1, 0x00000006, 0 },
-		{ X86_FEATURE_ARAT,		CR_EAX, 2, 0x00000006, 0 },
-		{ X86_FEATURE_PLN,		CR_EAX, 4, 0x00000006, 0 },
-		{ X86_FEATURE_PTS,		CR_EAX, 6, 0x00000006, 0 },
-		{ X86_FEATURE_HWP,		CR_EAX, 7, 0x00000006, 0 },
-		{ X86_FEATURE_HWP_NOTIFY,	CR_EAX, 8, 0x00000006, 0 },
-		{ X86_FEATURE_HWP_ACT_WINDOW,	CR_EAX, 9, 0x00000006, 0 },
-		{ X86_FEATURE_HWP_EPP,		CR_EAX,10, 0x00000006, 0 },
-		{ X86_FEATURE_HWP_PKG_REQ,	CR_EAX,11, 0x00000006, 0 },
 		{ X86_FEATURE_INTEL_PT,		CR_EBX,25, 0x00000007, 0 },
 		{ X86_FEATURE_APERFMPERF,	CR_ECX, 0, 0x00000006, 0 },
 		{ X86_FEATURE_EPB,		CR_ECX, 3, 0x00000006, 0 },
 		{ X86_FEATURE_HW_PSTATE,	CR_EDX, 7, 0x80000007, 0 },
 		{ X86_FEATURE_CPB,		CR_EDX, 9, 0x80000007, 0 },
 		{ X86_FEATURE_PROC_FEEDBACK,	CR_EDX,11, 0x80000007, 0 },
-		{ X86_FEATURE_NPT,		CR_EDX, 0, 0x8000000a, 0 },
-		{ X86_FEATURE_LBRV,		CR_EDX, 1, 0x8000000a, 0 },
-		{ X86_FEATURE_SVML,		CR_EDX, 2, 0x8000000a, 0 },
-		{ X86_FEATURE_NRIPS,		CR_EDX, 3, 0x8000000a, 0 },
-		{ X86_FEATURE_TSCRATEMSR,	CR_EDX, 4, 0x8000000a, 0 },
-		{ X86_FEATURE_VMCBCLEAN,	CR_EDX, 5, 0x8000000a, 0 },
-		{ X86_FEATURE_FLUSHBYASID,	CR_EDX, 6, 0x8000000a, 0 },
-		{ X86_FEATURE_DECODEASSISTS,	CR_EDX, 7, 0x8000000a, 0 },
-		{ X86_FEATURE_PAUSEFILTER,	CR_EDX,10, 0x8000000a, 0 },
-		{ X86_FEATURE_PFTHRESHOLD,	CR_EDX,12, 0x8000000a, 0 },
 		{ 0, 0, 0, 0, 0 }
 	};
 
-- 
2.3.5



* [RFC PATCH 2/3] x86/cpufeature: Cleanup get_cpu_cap()
  2015-11-10 11:48 [RFC PATCH 0/3] x86/cpufeature: Cleanup stuff Borislav Petkov
  2015-11-10 11:48 ` [RFC PATCH 1/3] x86/cpufeature: Move some of the scattered feature bits to x86_capability Borislav Petkov
@ 2015-11-10 11:48 ` Borislav Petkov
  2015-11-10 11:48 ` [RFC PATCH 3/3] x86/cpufeature: Remove unused and seldomly used cpu_has_xx macros Borislav Petkov
  2 siblings, 0 replies; 16+ messages in thread
From: Borislav Petkov @ 2015-11-10 11:48 UTC (permalink / raw)
  To: X86 ML; +Cc: LKML, Peter Zijlstra, Andy Lutomirski

From: Borislav Petkov <bp@suse.de>

Add an enum for the ->x86_capability array indices and cleanup
get_cpu_cap() by killing some redundant local vars.

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/include/asm/cpufeature.h | 20 +++++++++++++++++
 arch/x86/kernel/cpu/centaur.c     |  2 +-
 arch/x86/kernel/cpu/common.c      | 47 ++++++++++++++++++---------------------
 arch/x86/kernel/cpu/transmeta.c   |  4 ++--
 4 files changed, 45 insertions(+), 28 deletions(-)

diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index 13d78e0e6ae0..35401fef0d75 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -288,6 +288,26 @@
 #include <asm/asm.h>
 #include <linux/bitops.h>
 
+enum cpuid_leafs
+{
+	CPUID_1_EDX		= 0,
+	CPUID_8000_0001_EDX,
+	CPUID_8086_0001_EDX,
+	CPUID_LNX_1,
+	CPUID_1_ECX,
+	CPUID_C000_0001_EDX,
+	CPUID_8000_0001_ECX,
+	CPUID_LNX_2,
+	CPUID_LNX_3,
+	CPUID_7_0_EBX,
+	CPUID_D_1_EAX,
+	CPUID_F_0_EDX,
+	CPUID_F_1_EDX,
+	CPUID_8000_0008_EBX,
+	CPUID_6_EAX,
+	CPUID_8000_000A_EDX,
+};
+
 #ifdef CONFIG_X86_FEATURE_NAMES
 extern const char * const x86_cap_flags[NCAPINTS*32];
 extern const char * const x86_power_flags[32];
diff --git a/arch/x86/kernel/cpu/centaur.c b/arch/x86/kernel/cpu/centaur.c
index d8fba5c15fbd..ae20be6e483c 100644
--- a/arch/x86/kernel/cpu/centaur.c
+++ b/arch/x86/kernel/cpu/centaur.c
@@ -43,7 +43,7 @@ static void init_c3(struct cpuinfo_x86 *c)
 		/* store Centaur Extended Feature Flags as
 		 * word 5 of the CPU capability bit array
 		 */
-		c->x86_capability[5] = cpuid_edx(0xC0000001);
+		c->x86_capability[CPUID_C000_0001_EDX] = cpuid_edx(0xC0000001);
 	}
 #ifdef CONFIG_X86_32
 	/* Cyrix III family needs CX8 & PGE explicitly enabled. */
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 3e7ed149082b..6b6a74ddd4fc 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -600,52 +600,47 @@ void cpu_detect(struct cpuinfo_x86 *c)
 
 void get_cpu_cap(struct cpuinfo_x86 *c)
 {
-	u32 tfms, xlvl;
-	u32 ebx;
+	u32 eax, ebx, ecx, edx;
 
 	/* Intel-defined flags: level 0x00000001 */
 	if (c->cpuid_level >= 0x00000001) {
-		u32 capability, excap;
+		cpuid(0x00000001, &eax, &ebx, &ecx, &edx);
 
-		cpuid(0x00000001, &tfms, &ebx, &excap, &capability);
-		c->x86_capability[0] = capability;
-		c->x86_capability[4] = excap;
+		c->x86_capability[CPUID_1_ECX] = ecx;
+		c->x86_capability[CPUID_1_EDX] = edx;
 	}
 
 	/* Additional Intel-defined flags: level 0x00000007 */
 	if (c->cpuid_level >= 0x00000007) {
-		u32 eax, ebx, ecx, edx;
-
 		cpuid_count(0x00000007, 0, &eax, &ebx, &ecx, &edx);
 
-		c->x86_capability[9] = ebx;
+		c->x86_capability[CPUID_7_0_EBX] = ebx;
 
-		c->x86_capability[14] = cpuid_eax(0x00000006);
+		c->x86_capability[CPUID_6_EAX] = cpuid_eax(0x00000006);
 	}
 
 	/* Extended state features: level 0x0000000d */
 	if (c->cpuid_level >= 0x0000000d) {
-		u32 eax, ebx, ecx, edx;
-
 		cpuid_count(0x0000000d, 1, &eax, &ebx, &ecx, &edx);
 
-		c->x86_capability[10] = eax;
+		c->x86_capability[CPUID_D_1_EAX] = eax;
 	}
 
 	/* Additional Intel-defined flags: level 0x0000000F */
 	if (c->cpuid_level >= 0x0000000F) {
-		u32 eax, ebx, ecx, edx;
 
 		/* QoS sub-leaf, EAX=0Fh, ECX=0 */
 		cpuid_count(0x0000000F, 0, &eax, &ebx, &ecx, &edx);
-		c->x86_capability[11] = edx;
+		c->x86_capability[CPUID_F_0_EDX] = edx;
+
 		if (cpu_has(c, X86_FEATURE_CQM_LLC)) {
 			/* will be overridden if occupancy monitoring exists */
 			c->x86_cache_max_rmid = ebx;
 
 			/* QoS sub-leaf, EAX=0Fh, ECX=1 */
 			cpuid_count(0x0000000F, 1, &eax, &ebx, &ecx, &edx);
-			c->x86_capability[12] = edx;
+			c->x86_capability[CPUID_F_1_EDX] = edx;
+
 			if (cpu_has(c, X86_FEATURE_CQM_OCCUP_LLC)) {
 				c->x86_cache_max_rmid = ecx;
 				c->x86_cache_occ_scale = ebx;
@@ -657,22 +652,24 @@ void get_cpu_cap(struct cpuinfo_x86 *c)
 	}
 
 	/* AMD-defined flags: level 0x80000001 */
-	xlvl = cpuid_eax(0x80000000);
-	c->extended_cpuid_level = xlvl;
+	eax = cpuid_eax(0x80000000);
+	c->extended_cpuid_level = eax;
+
+	if ((eax & 0xffff0000) == 0x80000000) {
+		if (eax >= 0x80000001) {
+			cpuid(0x80000001, &eax, &ebx, &ecx, &edx);
 
-	if ((xlvl & 0xffff0000) == 0x80000000) {
-		if (xlvl >= 0x80000001) {
-			c->x86_capability[1] = cpuid_edx(0x80000001);
-			c->x86_capability[6] = cpuid_ecx(0x80000001);
+			c->x86_capability[CPUID_8000_0001_ECX] = ecx;
+			c->x86_capability[CPUID_8000_0001_EDX] = edx;
 		}
 	}
 
 	if (c->extended_cpuid_level >= 0x80000008) {
-		u32 eax = cpuid_eax(0x80000008);
+		cpuid(0x80000008, &eax, &ebx, &ecx, &edx);
 
 		c->x86_virt_bits = (eax >> 8) & 0xff;
 		c->x86_phys_bits = eax & 0xff;
-		c->x86_capability[13] = cpuid_ebx(0x80000008);
+		c->x86_capability[CPUID_8000_0008_EBX] = ebx;
 	}
 #ifdef CONFIG_X86_32
 	else if (cpu_has(c, X86_FEATURE_PAE) || cpu_has(c, X86_FEATURE_PSE36))
@@ -683,7 +680,7 @@ void get_cpu_cap(struct cpuinfo_x86 *c)
 		c->x86_power = cpuid_edx(0x80000007);
 
 	if (c->extended_cpuid_level >= 0x8000000a)
-		c->x86_capability[15] = cpuid_edx(0x8000000a);
+		c->x86_capability[CPUID_8000_000A_EDX] = cpuid_edx(0x8000000a);
 
 	init_scattered_cpuid_features(c);
 }
diff --git a/arch/x86/kernel/cpu/transmeta.c b/arch/x86/kernel/cpu/transmeta.c
index 3fa0e5ad86b4..252da7aceca6 100644
--- a/arch/x86/kernel/cpu/transmeta.c
+++ b/arch/x86/kernel/cpu/transmeta.c
@@ -12,7 +12,7 @@ static void early_init_transmeta(struct cpuinfo_x86 *c)
 	xlvl = cpuid_eax(0x80860000);
 	if ((xlvl & 0xffff0000) == 0x80860000) {
 		if (xlvl >= 0x80860001)
-			c->x86_capability[2] = cpuid_edx(0x80860001);
+			c->x86_capability[CPUID_8086_0001_EDX] = cpuid_edx(0x80860001);
 	}
 }
 
@@ -82,7 +82,7 @@ static void init_transmeta(struct cpuinfo_x86 *c)
 	/* Unhide possibly hidden capability flags */
 	rdmsr(0x80860004, cap_mask, uk);
 	wrmsr(0x80860004, ~0, uk);
-	c->x86_capability[0] = cpuid_edx(0x00000001);
+	c->x86_capability[CPUID_1_EDX] = cpuid_edx(0x00000001);
 	wrmsr(0x80860004, cap_mask, uk);
 
 	/* All Transmeta CPUs have a constant TSC */
-- 
2.3.5



* [RFC PATCH 3/3] x86/cpufeature: Remove unused and seldomly used cpu_has_xx macros
  2015-11-10 11:48 [RFC PATCH 0/3] x86/cpufeature: Cleanup stuff Borislav Petkov
  2015-11-10 11:48 ` [RFC PATCH 1/3] x86/cpufeature: Move some of the scattered feature bits to x86_capability Borislav Petkov
  2015-11-10 11:48 ` [RFC PATCH 2/3] x86/cpufeature: Cleanup get_cpu_cap() Borislav Petkov
@ 2015-11-10 11:48 ` Borislav Petkov
  2015-11-10 11:57   ` David Sterba
                     ` (2 more replies)
  2 siblings, 3 replies; 16+ messages in thread
From: Borislav Petkov @ 2015-11-10 11:48 UTC (permalink / raw)
  To: X86 ML
  Cc: LKML, Peter Zijlstra, Andy Lutomirski, Herbert Xu, Matt Mackall,
	Chris Mason, Josef Bacik, David Sterba

From: Borislav Petkov <bp@suse.de>

Those are stupid and code should use static_cpu_has_safe() anyway. Kill
the least used and unused ones.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Chris Mason <clm@fb.com>
Cc: Josef Bacik <jbacik@fb.com>
Cc: David Sterba <dsterba@suse.com>
---
 arch/x86/crypto/chacha20_glue.c             |  2 +-
 arch/x86/crypto/crc32c-intel_glue.c         |  3 ++-
 arch/x86/include/asm/cmpxchg_32.h           |  2 +-
 arch/x86/include/asm/cpufeature.h           | 32 +++--------------------------
 arch/x86/include/asm/smp.h                  |  2 +-
 arch/x86/kernel/cpu/amd.c                   |  2 +-
 arch/x86/kernel/cpu/intel.c                 |  3 ++-
 arch/x86/kernel/cpu/mtrr/generic.c          |  2 +-
 arch/x86/kernel/cpu/mtrr/main.c             |  2 +-
 arch/x86/kernel/cpu/perf_event_amd.c        |  4 ++--
 arch/x86/kernel/cpu/perf_event_amd_uncore.c |  8 ++++----
 arch/x86/kernel/fpu/init.c                  |  4 ++--
 arch/x86/kernel/hw_breakpoint.c             |  3 ++-
 arch/x86/kernel/vm86_32.c                   |  4 +++-
 arch/x86/mm/setup_nx.c                      |  4 ++--
 drivers/char/hw_random/via-rng.c            |  5 +++--
 drivers/crypto/padlock-aes.c                |  2 +-
 drivers/crypto/padlock-sha.c                |  3 ++-
 fs/btrfs/disk-io.c                          |  2 +-
 19 files changed, 35 insertions(+), 54 deletions(-)

diff --git a/arch/x86/crypto/chacha20_glue.c b/arch/x86/crypto/chacha20_glue.c
index 722bacea040e..8a7f1375ece4 100644
--- a/arch/x86/crypto/chacha20_glue.c
+++ b/arch/x86/crypto/chacha20_glue.c
@@ -125,7 +125,7 @@ static struct crypto_alg alg = {
 
 static int __init chacha20_simd_mod_init(void)
 {
-	if (!cpu_has_ssse3)
+	if (!static_cpu_has_safe(X86_FEATURE_SSSE3))
 		return -ENODEV;
 
 #ifdef CONFIG_AS_AVX2
diff --git a/arch/x86/crypto/crc32c-intel_glue.c b/arch/x86/crypto/crc32c-intel_glue.c
index 81a595d75cf5..72a991fb643f 100644
--- a/arch/x86/crypto/crc32c-intel_glue.c
+++ b/arch/x86/crypto/crc32c-intel_glue.c
@@ -256,8 +256,9 @@ static int __init crc32c_intel_mod_init(void)
 {
 	if (!x86_match_cpu(crc32c_cpu_id))
 		return -ENODEV;
+
 #ifdef CONFIG_X86_64
-	if (cpu_has_pclmulqdq) {
+	if (static_cpu_has_safe(X86_FEATURE_PCLMULQDQ)) {
 		alg.update = crc32c_pcl_intel_update;
 		alg.finup = crc32c_pcl_intel_finup;
 		alg.digest = crc32c_pcl_intel_digest;
diff --git a/arch/x86/include/asm/cmpxchg_32.h b/arch/x86/include/asm/cmpxchg_32.h
index f7e142926481..2c88969f78c7 100644
--- a/arch/x86/include/asm/cmpxchg_32.h
+++ b/arch/x86/include/asm/cmpxchg_32.h
@@ -109,6 +109,6 @@ static inline u64 __cmpxchg64_local(volatile u64 *ptr, u64 old, u64 new)
 
 #endif
 
-#define system_has_cmpxchg_double() cpu_has_cx8
+#define system_has_cmpxchg_double() static_cpu_has_safe(X86_FEATURE_CX8)
 
 #endif /* _ASM_X86_CMPXCHG_32_H */
diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index 35401fef0d75..27ab2e7d14c4 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -390,53 +390,27 @@ extern const char * const x86_bug_flags[NBUGINTS*32];
 #define cpu_has_tsc		boot_cpu_has(X86_FEATURE_TSC)
 #define cpu_has_pge		boot_cpu_has(X86_FEATURE_PGE)
 #define cpu_has_apic		boot_cpu_has(X86_FEATURE_APIC)
-#define cpu_has_sep		boot_cpu_has(X86_FEATURE_SEP)
-#define cpu_has_mtrr		boot_cpu_has(X86_FEATURE_MTRR)
 #define cpu_has_mmx		boot_cpu_has(X86_FEATURE_MMX)
 #define cpu_has_fxsr		boot_cpu_has(X86_FEATURE_FXSR)
 #define cpu_has_xmm		boot_cpu_has(X86_FEATURE_XMM)
 #define cpu_has_xmm2		boot_cpu_has(X86_FEATURE_XMM2)
-#define cpu_has_xmm3		boot_cpu_has(X86_FEATURE_XMM3)
-#define cpu_has_ssse3		boot_cpu_has(X86_FEATURE_SSSE3)
 #define cpu_has_aes		boot_cpu_has(X86_FEATURE_AES)
 #define cpu_has_avx		boot_cpu_has(X86_FEATURE_AVX)
 #define cpu_has_avx2		boot_cpu_has(X86_FEATURE_AVX2)
-#define cpu_has_ht		boot_cpu_has(X86_FEATURE_HT)
-#define cpu_has_nx		boot_cpu_has(X86_FEATURE_NX)
-#define cpu_has_xstore		boot_cpu_has(X86_FEATURE_XSTORE)
-#define cpu_has_xstore_enabled	boot_cpu_has(X86_FEATURE_XSTORE_EN)
-#define cpu_has_xcrypt		boot_cpu_has(X86_FEATURE_XCRYPT)
-#define cpu_has_xcrypt_enabled	boot_cpu_has(X86_FEATURE_XCRYPT_EN)
-#define cpu_has_ace2		boot_cpu_has(X86_FEATURE_ACE2)
-#define cpu_has_ace2_enabled	boot_cpu_has(X86_FEATURE_ACE2_EN)
-#define cpu_has_phe		boot_cpu_has(X86_FEATURE_PHE)
-#define cpu_has_phe_enabled	boot_cpu_has(X86_FEATURE_PHE_EN)
-#define cpu_has_pmm		boot_cpu_has(X86_FEATURE_PMM)
-#define cpu_has_pmm_enabled	boot_cpu_has(X86_FEATURE_PMM_EN)
-#define cpu_has_ds		boot_cpu_has(X86_FEATURE_DS)
-#define cpu_has_pebs		boot_cpu_has(X86_FEATURE_PEBS)
 #define cpu_has_clflush		boot_cpu_has(X86_FEATURE_CLFLUSH)
-#define cpu_has_bts		boot_cpu_has(X86_FEATURE_BTS)
 #define cpu_has_gbpages		boot_cpu_has(X86_FEATURE_GBPAGES)
 #define cpu_has_arch_perfmon	boot_cpu_has(X86_FEATURE_ARCH_PERFMON)
 #define cpu_has_pat		boot_cpu_has(X86_FEATURE_PAT)
-#define cpu_has_xmm4_1		boot_cpu_has(X86_FEATURE_XMM4_1)
-#define cpu_has_xmm4_2		boot_cpu_has(X86_FEATURE_XMM4_2)
 #define cpu_has_x2apic		boot_cpu_has(X86_FEATURE_X2APIC)
 #define cpu_has_xsave		boot_cpu_has(X86_FEATURE_XSAVE)
-#define cpu_has_xsaveopt	boot_cpu_has(X86_FEATURE_XSAVEOPT)
 #define cpu_has_xsaves		boot_cpu_has(X86_FEATURE_XSAVES)
 #define cpu_has_osxsave		boot_cpu_has(X86_FEATURE_OSXSAVE)
 #define cpu_has_hypervisor	boot_cpu_has(X86_FEATURE_HYPERVISOR)
-#define cpu_has_pclmulqdq	boot_cpu_has(X86_FEATURE_PCLMULQDQ)
-#define cpu_has_perfctr_core	boot_cpu_has(X86_FEATURE_PERFCTR_CORE)
-#define cpu_has_perfctr_nb	boot_cpu_has(X86_FEATURE_PERFCTR_NB)
-#define cpu_has_perfctr_l2	boot_cpu_has(X86_FEATURE_PERFCTR_L2)
-#define cpu_has_cx8		boot_cpu_has(X86_FEATURE_CX8)
 #define cpu_has_cx16		boot_cpu_has(X86_FEATURE_CX16)
-#define cpu_has_eager_fpu	boot_cpu_has(X86_FEATURE_EAGER_FPU)
 #define cpu_has_topoext		boot_cpu_has(X86_FEATURE_TOPOEXT)
-#define cpu_has_bpext		boot_cpu_has(X86_FEATURE_BPEXT)
+/*
+ * Do not add any more of those clumsy macros - use static_cpu_has_safe()!
+ */
 
 #if __GNUC__ >= 4
 extern void warn_pre_alternatives(void);
diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
index 222a6a3ca2b5..a8578776c5cb 100644
--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -25,7 +25,7 @@ static inline bool cpu_has_ht_siblings(void)
 {
 	bool has_siblings = false;
 #ifdef CONFIG_SMP
-	has_siblings = cpu_has_ht && smp_num_siblings > 1;
+	has_siblings = static_cpu_has_safe(X86_FEATURE_HT) && smp_num_siblings > 1;
 #endif
 	return has_siblings;
 }
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 4a70fc6d400a..c018ca641112 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -909,7 +909,7 @@ static bool cpu_has_amd_erratum(struct cpuinfo_x86 *cpu, const int *erratum)
 
 void set_dr_addr_mask(unsigned long mask, int dr)
 {
-	if (!cpu_has_bpext)
+	if (!static_cpu_has_safe(X86_FEATURE_BPEXT))
 		return;
 
 	switch (dr) {
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index 98a13db5f4be..5fc71c43dc22 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -444,7 +444,8 @@ static void init_intel(struct cpuinfo_x86 *c)
 
 	if (cpu_has_xmm2)
 		set_cpu_cap(c, X86_FEATURE_LFENCE_RDTSC);
-	if (cpu_has_ds) {
+
+	if (static_cpu_has_safe(X86_FEATURE_DS)) {
 		unsigned int l1;
 		rdmsr(MSR_IA32_MISC_ENABLE, l1, l2);
 		if (!(l1 & (1<<11)))
diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c
index 3b533cf37c74..8f2ef910c7bf 100644
--- a/arch/x86/kernel/cpu/mtrr/generic.c
+++ b/arch/x86/kernel/cpu/mtrr/generic.c
@@ -349,7 +349,7 @@ static void get_fixed_ranges(mtrr_type *frs)
 
 void mtrr_save_fixed_ranges(void *info)
 {
-	if (cpu_has_mtrr)
+	if (static_cpu_has_safe(X86_FEATURE_MTRR))
 		get_fixed_ranges(mtrr_state.fixed_ranges);
 }
 
diff --git a/arch/x86/kernel/cpu/mtrr/main.c b/arch/x86/kernel/cpu/mtrr/main.c
index f891b4750f04..2c01181236fc 100644
--- a/arch/x86/kernel/cpu/mtrr/main.c
+++ b/arch/x86/kernel/cpu/mtrr/main.c
@@ -682,7 +682,7 @@ void __init mtrr_bp_init(void)
 
 	phys_addr = 32;
 
-	if (cpu_has_mtrr) {
+	if (static_cpu_has_safe(X86_FEATURE_MTRR)) {
 		mtrr_if = &generic_mtrr_ops;
 		size_or_mask = SIZE_OR_MASK_BITS(36);
 		size_and_mask = 0x00f00000;
diff --git a/arch/x86/kernel/cpu/perf_event_amd.c b/arch/x86/kernel/cpu/perf_event_amd.c
index 1cee5d2d7ece..86fac8bf627c 100644
--- a/arch/x86/kernel/cpu/perf_event_amd.c
+++ b/arch/x86/kernel/cpu/perf_event_amd.c
@@ -160,7 +160,7 @@ static inline int amd_pmu_addr_offset(int index, bool eventsel)
 	if (offset)
 		return offset;
 
-	if (!cpu_has_perfctr_core)
+	if (!static_cpu_has_safe(X86_FEATURE_PERFCTR_CORE))
 		offset = index;
 	else
 		offset = index << 1;
@@ -652,7 +652,7 @@ static __initconst const struct x86_pmu amd_pmu = {
 
 static int __init amd_core_pmu_init(void)
 {
-	if (!cpu_has_perfctr_core)
+	if (!static_cpu_has_safe(X86_FEATURE_PERFCTR_CORE))
 		return 0;
 
 	switch (boot_cpu_data.x86) {
diff --git a/arch/x86/kernel/cpu/perf_event_amd_uncore.c b/arch/x86/kernel/cpu/perf_event_amd_uncore.c
index cc6cedb8f25d..533e5eab7d94 100644
--- a/arch/x86/kernel/cpu/perf_event_amd_uncore.c
+++ b/arch/x86/kernel/cpu/perf_event_amd_uncore.c
@@ -526,7 +526,7 @@ static int __init amd_uncore_init(void)
 	if (!cpu_has_topoext)
 		goto fail_nodev;
 
-	if (cpu_has_perfctr_nb) {
+	if (static_cpu_has_safe(X86_FEATURE_PERFCTR_NB)) {
 		amd_uncore_nb = alloc_percpu(struct amd_uncore *);
 		if (!amd_uncore_nb) {
 			ret = -ENOMEM;
@@ -540,7 +540,7 @@ static int __init amd_uncore_init(void)
 		ret = 0;
 	}
 
-	if (cpu_has_perfctr_l2) {
+	if (static_cpu_has_safe(X86_FEATURE_PERFCTR_L2)) {
 		amd_uncore_l2 = alloc_percpu(struct amd_uncore *);
 		if (!amd_uncore_l2) {
 			ret = -ENOMEM;
@@ -583,10 +583,10 @@ fail_online:
 
 	/* amd_uncore_nb/l2 should have been freed by cleanup_cpu_online */
 	amd_uncore_nb = amd_uncore_l2 = NULL;
-	if (cpu_has_perfctr_l2)
+	if (static_cpu_has_safe(X86_FEATURE_PERFCTR_L2))
 		perf_pmu_unregister(&amd_l2_pmu);
 fail_l2:
-	if (cpu_has_perfctr_nb)
+	if (static_cpu_has_safe(X86_FEATURE_PERFCTR_NB))
 		perf_pmu_unregister(&amd_nb_pmu);
 	if (amd_uncore_l2)
 		free_percpu(amd_uncore_l2);
diff --git a/arch/x86/kernel/fpu/init.c b/arch/x86/kernel/fpu/init.c
index be39b5fde4b9..c5674bbecc85 100644
--- a/arch/x86/kernel/fpu/init.c
+++ b/arch/x86/kernel/fpu/init.c
@@ -12,7 +12,7 @@
  */
 static void fpu__init_cpu_ctx_switch(void)
 {
-	if (!cpu_has_eager_fpu)
+	if (!static_cpu_has_safe(X86_FEATURE_EAGER_FPU))
 		stts();
 	else
 		clts();
@@ -287,7 +287,7 @@ static void __init fpu__init_system_ctx_switch(void)
 	current_thread_info()->status = 0;
 
 	/* Auto enable eagerfpu for xsaveopt */
-	if (cpu_has_xsaveopt && eagerfpu != DISABLE)
+	if (static_cpu_has_safe(X86_FEATURE_XSAVEOPT) && eagerfpu != DISABLE)
 		eagerfpu = ENABLE;
 
 	if (xfeatures_mask & XFEATURE_MASK_EAGER) {
diff --git a/arch/x86/kernel/hw_breakpoint.c b/arch/x86/kernel/hw_breakpoint.c
index 50a3fad5b89f..27c69ec5fc30 100644
--- a/arch/x86/kernel/hw_breakpoint.c
+++ b/arch/x86/kernel/hw_breakpoint.c
@@ -307,8 +307,9 @@ static int arch_build_bp_info(struct perf_event *bp)
 		 * breakpoints, then we'll have to check for kprobe-blacklisted
 		 * addresses anywhere in the range.
 		 */
-		if (!cpu_has_bpext)
+		if (!static_cpu_has_safe(X86_FEATURE_BPEXT))
 			return -EOPNOTSUPP;
+
 		info->mask = bp->attr.bp_len - 1;
 		info->len = X86_BREAKPOINT_LEN_1;
 	}
diff --git a/arch/x86/kernel/vm86_32.c b/arch/x86/kernel/vm86_32.c
index 524619351961..483231ebbb0b 100644
--- a/arch/x86/kernel/vm86_32.c
+++ b/arch/x86/kernel/vm86_32.c
@@ -357,8 +357,10 @@ static long do_sys_vm86(struct vm86plus_struct __user *user_vm86, bool plus)
 	tss = &per_cpu(cpu_tss, get_cpu());
 	/* make room for real-mode segments */
 	tsk->thread.sp0 += 16;
-	if (cpu_has_sep)
+
+	if (static_cpu_has_safe(X86_FEATURE_SEP))
 		tsk->thread.sysenter_cs = 0;
+
 	load_sp0(tss, &tsk->thread);
 	put_cpu();
 
diff --git a/arch/x86/mm/setup_nx.c b/arch/x86/mm/setup_nx.c
index 90555bf60aa4..595dc8e019a1 100644
--- a/arch/x86/mm/setup_nx.c
+++ b/arch/x86/mm/setup_nx.c
@@ -31,7 +31,7 @@ early_param("noexec", noexec_setup);
 
 void x86_configure_nx(void)
 {
-	if (cpu_has_nx && !disable_nx)
+	if (static_cpu_has_safe(X86_FEATURE_NX) && !disable_nx)
 		__supported_pte_mask |= _PAGE_NX;
 	else
 		__supported_pte_mask &= ~_PAGE_NX;
@@ -39,7 +39,7 @@ void x86_configure_nx(void)
 
 void __init x86_report_nx(void)
 {
-	if (!cpu_has_nx) {
+	if (!static_cpu_has_safe(X86_FEATURE_NX)) {
 		printk(KERN_NOTICE "Notice: NX (Execute Disable) protection "
 		       "missing in CPU!\n");
 	} else {
diff --git a/drivers/char/hw_random/via-rng.c b/drivers/char/hw_random/via-rng.c
index 0c98a9d51a24..6052f619fd39 100644
--- a/drivers/char/hw_random/via-rng.c
+++ b/drivers/char/hw_random/via-rng.c
@@ -140,7 +140,7 @@ static int via_rng_init(struct hwrng *rng)
 	 * RNG configuration like it used to be the case in this
 	 * register */
 	if ((c->x86 == 6) && (c->x86_model >= 0x0f)) {
-		if (!cpu_has_xstore_enabled) {
+		if (!static_cpu_has_safe(X86_FEATURE_XSTORE_EN)) {
 			pr_err(PFX "can't enable hardware RNG "
 				"if XSTORE is not enabled\n");
 			return -ENODEV;
@@ -200,8 +200,9 @@ static int __init mod_init(void)
 {
 	int err;
 
-	if (!cpu_has_xstore)
+	if (!static_cpu_has_safe(X86_FEATURE_XSTORE))
 		return -ENODEV;
+
 	pr_info("VIA RNG detected\n");
 	err = hwrng_register(&via_rng);
 	if (err) {
diff --git a/drivers/crypto/padlock-aes.c b/drivers/crypto/padlock-aes.c
index da2d6777bd09..360e941968dd 100644
--- a/drivers/crypto/padlock-aes.c
+++ b/drivers/crypto/padlock-aes.c
@@ -515,7 +515,7 @@ static int __init padlock_init(void)
 	if (!x86_match_cpu(padlock_cpu_id))
 		return -ENODEV;
 
-	if (!cpu_has_xcrypt_enabled) {
+	if (!static_cpu_has_safe(X86_FEATURE_XCRYPT_EN)) {
 		printk(KERN_NOTICE PFX "VIA PadLock detected, but not enabled. Hmm, strange...\n");
 		return -ENODEV;
 	}
diff --git a/drivers/crypto/padlock-sha.c b/drivers/crypto/padlock-sha.c
index 4e154c9b9206..18fec3381178 100644
--- a/drivers/crypto/padlock-sha.c
+++ b/drivers/crypto/padlock-sha.c
@@ -540,7 +540,8 @@ static int __init padlock_init(void)
 	struct shash_alg *sha1;
 	struct shash_alg *sha256;
 
-	if (!x86_match_cpu(padlock_sha_ids) || !cpu_has_phe_enabled)
+	if (!x86_match_cpu(padlock_sha_ids) ||
+	    !static_cpu_has_safe(X86_FEATURE_PHE_EN))
 		return -ENODEV;
 
 	/* Register the newly added algorithm module if on *
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 640598c0d0e7..d4544172e399 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -923,7 +923,7 @@ static int check_async_write(struct inode *inode, unsigned long bio_flags)
 	if (bio_flags & EXTENT_BIO_TREE_LOG)
 		return 0;
 #ifdef CONFIG_X86
-	if (cpu_has_xmm4_2)
+	if (static_cpu_has_safe(X86_FEATURE_XMM4_2))
 		return 0;
 #endif
 	return 1;
-- 
2.3.5


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH 3/3] x86/cpufeature: Remove unused and seldomly used cpu_has_xx macros
  2015-11-10 11:48 ` [RFC PATCH 3/3] x86/cpufeature: Remove unused and seldomly used cpu_has_xx macros Borislav Petkov
@ 2015-11-10 11:57   ` David Sterba
  2015-11-10 12:30   ` Ingo Molnar
  2015-11-24 13:05   ` Borislav Petkov
  2 siblings, 0 replies; 16+ messages in thread
From: David Sterba @ 2015-11-10 11:57 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: X86 ML, Andy Lutomirski, Peter Zijlstra, Chris Mason,
	Josef Bacik, Herbert Xu, Matt Mackall, LKML

On Tue, Nov 10, 2015 at 12:48:42PM +0100, Borislav Petkov wrote:
> From: Borislav Petkov <bp@suse.de>
> 
> Those are stupid and code should use static_cpu_has_safe() anyway. Kill
> the least used and unused ones.
> 
> Signed-off-by: Borislav Petkov <bp@suse.de>
> Cc: Herbert Xu <herbert@gondor.apana.org.au>
> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Cc: Matt Mackall <mpm@selenic.com>
> Cc: Chris Mason <clm@fb.com>
> Cc: Josef Bacik <jbacik@fb.com>
> Cc: David Sterba <dsterba@suse.com>

Acked-by: David Sterba <dsterba@suse.com>


* Re: [RFC PATCH 3/3] x86/cpufeature: Remove unused and seldomly used cpu_has_xx macros
  2015-11-10 11:48 ` [RFC PATCH 3/3] x86/cpufeature: Remove unused and seldomly used cpu_has_xx macros Borislav Petkov
  2015-11-10 11:57   ` David Sterba
@ 2015-11-10 12:30   ` Ingo Molnar
  2015-11-10 12:37     ` Borislav Petkov
  2015-11-18 18:23     ` Borislav Petkov
  2015-11-24 13:05   ` Borislav Petkov
  2 siblings, 2 replies; 16+ messages in thread
From: Ingo Molnar @ 2015-11-10 12:30 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: X86 ML, LKML, Peter Zijlstra, Andy Lutomirski, Herbert Xu,
	Matt Mackall, Chris Mason, Josef Bacik, David Sterba


* Borislav Petkov <bp@alien8.de> wrote:

> From: Borislav Petkov <bp@suse.de>
> 
> Those are stupid and code should use static_cpu_has_safe() anyway. Kill
> the least used and unused ones.

So cpufeature.h doesn't really do a good job of explaining what the difference is 
between all these variants:

	cpu_has()
	static_cpu_has()
	static_cpu_has_safe()

it has this comment:

/*
 * Static testing of CPU features.  Used the same as boot_cpu_has().
 * These are only valid after alternatives have run, but will statically
 * patch the target code for additional performance.
 */

The second sentence does not parse. Why does the third sentence have a 'but' for
listing properties? It's either bad grammar or tries to tell something that isn't
being told properly.

It's entirely silent on the difference between static_cpu_has() and 
static_cpu_has_safe() - what makes the second one 'safe'?

Thanks,

	Ingo


* Re: [RFC PATCH 3/3] x86/cpufeature: Remove unused and seldomly used cpu_has_xx macros
  2015-11-10 12:30   ` Ingo Molnar
@ 2015-11-10 12:37     ` Borislav Petkov
  2015-11-18 18:23     ` Borislav Petkov
  1 sibling, 0 replies; 16+ messages in thread
From: Borislav Petkov @ 2015-11-10 12:37 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: X86 ML, LKML, Peter Zijlstra, Andy Lutomirski, Herbert Xu,
	Matt Mackall, Chris Mason, Josef Bacik, David Sterba

On Tue, Nov 10, 2015 at 01:30:00PM +0100, Ingo Molnar wrote:
...

> It's entirely silent on the difference between static_cpu_has() and 
> static_cpu_has_safe() - what makes the second one 'safe'?

Yeah, those really are lacking some more fleshed-out and detailed
explanations. :-)

I'll document those properly.

Thanks.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.


* Re: [RFC PATCH 3/3] x86/cpufeature: Remove unused and seldomly used cpu_has_xx macros
  2015-11-10 12:30   ` Ingo Molnar
  2015-11-10 12:37     ` Borislav Petkov
@ 2015-11-18 18:23     ` Borislav Petkov
  1 sibling, 0 replies; 16+ messages in thread
From: Borislav Petkov @ 2015-11-18 18:23 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: X86 ML, LKML, Peter Zijlstra, Andy Lutomirski, Herbert Xu,
	Matt Mackall, Chris Mason, Josef Bacik, David Sterba

On Tue, Nov 10, 2015 at 01:30:00PM +0100, Ingo Molnar wrote:
> 
> * Borislav Petkov <bp@alien8.de> wrote:
> 
> > From: Borislav Petkov <bp@suse.de>
> > 
> > Those are stupid and code should use static_cpu_has_safe() anyway. Kill
> > the least used and unused ones.
> 
> So cpufeature.h doesn't really do a good job of explaining what the difference is 
> between all these variants:

How's that for starters?

---
diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index 27ab2e7d14c4..a9a8313e278e 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -351,6 +351,10 @@ extern const char * const x86_bug_flags[NBUGINTS*32];
 	   (((bit)>>5)==8 && (1UL<<((bit)&31) & DISABLED_MASK8)) ||	\
 	   (((bit)>>5)==9 && (1UL<<((bit)&31) & DISABLED_MASK9)) )
 
+/*
+ * Test whether the CPU represented by descriptor @c has the feature bit @bit
+ * set.
+ */
 #define cpu_has(c, bit)							\
 	(__builtin_constant_p(bit) && REQUIRED_MASK_BIT_SET(bit) ? 1 :	\
 	 test_cpu_cap(c, bit))
@@ -416,11 +420,6 @@ extern const char * const x86_bug_flags[NBUGINTS*32];
 extern void warn_pre_alternatives(void);
 extern bool __static_cpu_has_safe(u16 bit);
 
-/*
- * Static testing of CPU features.  Used the same as boot_cpu_has().
- * These are only valid after alternatives have run, but will statically
- * patch the target code for additional performance.
- */
 static __always_inline __pure bool __static_cpu_has(u16 bit)
 {
 #ifdef CC_HAVE_ASM_GOTO
@@ -495,6 +494,18 @@ static __always_inline __pure bool __static_cpu_has(u16 bit)
 #endif /* CC_HAVE_ASM_GOTO */
 }
 
+/*
+ * Test whether the boot CPU has feature bit @bit enabled.
+ *
+ * This is static testing of CPU features. It is used in the same manner as
+ * boot_cpu_has(). It differs from the previous one in that the alternatives
+ * infrastructure will statically patch the code where the test is performed for
+ * additional performance.
+ *
+ * However, results from that macro are only valid after the alternatives have
+ * run and not before that. IOW, you want static_cpu_has_safe() instead, see
+ * below.
+ */
 #define static_cpu_has(bit)					\
 (								\
 	__builtin_constant_p(boot_cpu_has(bit)) ?		\
@@ -580,6 +591,11 @@ static __always_inline __pure bool _static_cpu_has_safe(u16 bit)
 #endif /* CC_HAVE_ASM_GOTO */
 }
 
+/*
+ * Like static_cpu_has() above but it works even before the alternatives have
+ * run by falling back to boot_cpu_has(). You should use that macro for all your
+ * CPU feature bit testing needs.
+ */
 #define static_cpu_has_safe(bit)				\
 (								\
 	__builtin_constant_p(boot_cpu_has(bit)) ?		\

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.


* Re: [RFC PATCH 3/3] x86/cpufeature: Remove unused and seldomly used cpu_has_xx macros
  2015-11-10 11:48 ` [RFC PATCH 3/3] x86/cpufeature: Remove unused and seldomly used cpu_has_xx macros Borislav Petkov
  2015-11-10 11:57   ` David Sterba
  2015-11-10 12:30   ` Ingo Molnar
@ 2015-11-24 13:05   ` Borislav Petkov
  2015-11-24 22:42     ` Josh Triplett
  2 siblings, 1 reply; 16+ messages in thread
From: Borislav Petkov @ 2015-11-24 13:05 UTC (permalink / raw)
  To: X86 ML, Josh Triplett
  Cc: LKML, Peter Zijlstra, Andy Lutomirski, Herbert Xu, Matt Mackall,
	Chris Mason, Josef Bacik, David Sterba, kbuild test robot

On Tue, Nov 10, 2015 at 12:48:42PM +0100, Borislav Petkov wrote:
> From: Borislav Petkov <bp@suse.de>
> 
> Those are stupid and code should use static_cpu_has_safe() anyway. Kill
> the least used and unused ones.
> 
> Signed-off-by: Borislav Petkov <bp@suse.de>
> Cc: Herbert Xu <herbert@gondor.apana.org.au>
> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Cc: Matt Mackall <mpm@selenic.com>
> Cc: Chris Mason <clm@fb.com>
> Cc: Josef Bacik <jbacik@fb.com>
> Cc: David Sterba <dsterba@suse.com>
> ---
>  arch/x86/crypto/chacha20_glue.c             |  2 +-
>  arch/x86/crypto/crc32c-intel_glue.c         |  3 ++-
>  arch/x86/include/asm/cmpxchg_32.h           |  2 +-
>  arch/x86/include/asm/cpufeature.h           | 32 +++--------------------------
>  arch/x86/include/asm/smp.h                  |  2 +-
>  arch/x86/kernel/cpu/amd.c                   |  2 +-
>  arch/x86/kernel/cpu/intel.c                 |  3 ++-
>  arch/x86/kernel/cpu/mtrr/generic.c          |  2 +-
>  arch/x86/kernel/cpu/mtrr/main.c             |  2 +-
>  arch/x86/kernel/cpu/perf_event_amd.c        |  4 ++--
>  arch/x86/kernel/cpu/perf_event_amd_uncore.c |  8 ++++----
>  arch/x86/kernel/fpu/init.c                  |  4 ++--
>  arch/x86/kernel/hw_breakpoint.c             |  3 ++-
>  arch/x86/kernel/vm86_32.c                   |  4 +++-
>  arch/x86/mm/setup_nx.c                      |  4 ++--
>  drivers/char/hw_random/via-rng.c            |  5 +++--
>  drivers/crypto/padlock-aes.c                |  2 +-
>  drivers/crypto/padlock-sha.c                |  3 ++-
>  fs/btrfs/disk-io.c                          |  2 +-
>  19 files changed, 35 insertions(+), 54 deletions(-)

Ok, 0day says this patch makes tiny not so tiny:

i386-tinyconfig vmlinux size:

+-------+------+-------+-----+--------------------------------------------------------------------------------------+
| TOTAL | TEXT | DATA  | BSS |                                                                                      |
+-------+------+-------+-----+--------------------------------------------------------------------------------------+
| +4646 |  +64 | +4096 |   0 | ab9976b5af96 x86/cpufeature: Remove unused and seldomly used cpu_has_xx macros       |
|   -32 |  -32 |     0 |   0 | 13e835020a02 x86/cpufeature: Cleanup get_cpu_cap()                                   |
|   +32 |  +32 |     0 |   0 | 3615f94f0486 x86/cpufeature: Move some of the scattered feature bits to x86_capabili |
|  +136 |  +32 |     0 |   0 | 506d983184f4 Merge branch 'tip-fpu-xsave' into rc2+                                  |
| +4782 |  +96 | +4096 |   0 | ALL COMMITS                                                                          |
+-------+------+-------+-----+--------------------------------------------------------------------------------------+

Btw, thanks 0day!

The problem comes from static_cpu_has_safe() adding the alternatives and
fallback machinery. For example, before the patch, we had this at the
cpu_has_* testing sites:

        movl    boot_cpu_data+20, %eax  # MEM[(const long unsigned int *)&boot_cpu_data + 20B], D.19113
        testl   $2097152, %eax  #, D.19113
        je      .L166   #,

and now we get this:

#APP
# 449 "arch/x86/kernel/cpu/intel.c" 1
# 0 "" 2
# 511 "./arch/x86/include/asm/cpufeature.h" 1
        1: jmp .L166    #
2:
.skip -(((5f-4f) - (2b-1b)) > 0) * ((5f-4f) - (2b-1b)),0x90
3:
.section .altinstructions,"a"
 .long 1b - .
 .long 4f - .
 .word 117      #
 .byte 3b - 1b
 .byte 5f - 4f
 .byte 3b - 2b
.previous
.section .altinstr_replacement,"ax"
4: jmp .L167    #
5:
.previous
.section .altinstructions,"a"
 .long 1b - .
 .long 0
 .word 21       #
 .byte 3b - 1b
 .byte 0
 .byte 0
.previous

# 0 "" 2
#NO_APP
        jmp     .L168   #
.L166:
        movl    $21, %eax       #,
        call    __static_cpu_has_safe   #
        testb   %al, %al        # D.19126
        je      .L167   #,
.L168:
#APP
# 453 "arch/x86/kernel/cpu/intel.c" 1
# 0 "" 2
#NO_APP

That gets spread among .altinstructions, .altinstr_replacement, .text
etc sections. .data grows too probably because of the NOP padding :-\

		   text    data     bss     dec     hex filename
before:		 644896  127436 1189384 1961716  1deef4 vmlinux
after:		 645446  131532 1189384 1966362  1e011a vmlinux

	[Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
before:	[12] .altinstructions  PROGBITS        c10bdf48 0bef48 000680 00   A  0   0  1
after:	[12] .altinstructions  PROGBITS        c10bff48 0c0f48 0007d2 00   A  0   0  1

before:	[13] .altinstr_replace PROGBITS        c10be5c8 0bf5c8 00016c 00  AX  0   0  1
after:	[13] .altinstr_replace PROGBITS        c10c071a 0c171a 0001ad 00  AX  0   0  1

before:	[ 7] .data             PROGBITS        c1092000 093000 0132a0 00  WA  0   0 4096
after:	[ 7] .data             PROGBITS        c1093000 094000 0142a0 00  WA  0   0 4096

So I'm wondering if we should make a config option which converts
static_cpu_has* macros to boot_cpu_has()? That should slim down
the kernel even more but it won't benefit from the speedup of the
static_cpu_has* stuff.

Josh, thoughts?

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.


* Re: [RFC PATCH 3/3] x86/cpufeature: Remove unused and seldomly used cpu_has_xx macros
  2015-11-24 13:05   ` Borislav Petkov
@ 2015-11-24 22:42     ` Josh Triplett
  2015-11-25  0:10       ` Andy Lutomirski
  2015-11-27 13:52       ` Borislav Petkov
  0 siblings, 2 replies; 16+ messages in thread
From: Josh Triplett @ 2015-11-24 22:42 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: X86 ML, LKML, Peter Zijlstra, Andy Lutomirski, Herbert Xu,
	Matt Mackall, Chris Mason, Josef Bacik, David Sterba,
	kbuild test robot

On Tue, Nov 24, 2015 at 02:05:10PM +0100, Borislav Petkov wrote:
> On Tue, Nov 10, 2015 at 12:48:42PM +0100, Borislav Petkov wrote:
> > From: Borislav Petkov <bp@suse.de>
> > 
> > Those are stupid and code should use static_cpu_has_safe() anyway. Kill
> > the least used and unused ones.
> > 
> > Signed-off-by: Borislav Petkov <bp@suse.de>
> > Cc: Herbert Xu <herbert@gondor.apana.org.au>
> > Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> > Cc: Matt Mackall <mpm@selenic.com>
> > Cc: Chris Mason <clm@fb.com>
> > Cc: Josef Bacik <jbacik@fb.com>
> > Cc: David Sterba <dsterba@suse.com>
> > ---
> >  arch/x86/crypto/chacha20_glue.c             |  2 +-
> >  arch/x86/crypto/crc32c-intel_glue.c         |  3 ++-
> >  arch/x86/include/asm/cmpxchg_32.h           |  2 +-
> >  arch/x86/include/asm/cpufeature.h           | 32 +++--------------------------
> >  arch/x86/include/asm/smp.h                  |  2 +-
> >  arch/x86/kernel/cpu/amd.c                   |  2 +-
> >  arch/x86/kernel/cpu/intel.c                 |  3 ++-
> >  arch/x86/kernel/cpu/mtrr/generic.c          |  2 +-
> >  arch/x86/kernel/cpu/mtrr/main.c             |  2 +-
> >  arch/x86/kernel/cpu/perf_event_amd.c        |  4 ++--
> >  arch/x86/kernel/cpu/perf_event_amd_uncore.c |  8 ++++----
> >  arch/x86/kernel/fpu/init.c                  |  4 ++--
> >  arch/x86/kernel/hw_breakpoint.c             |  3 ++-
> >  arch/x86/kernel/vm86_32.c                   |  4 +++-
> >  arch/x86/mm/setup_nx.c                      |  4 ++--
> >  drivers/char/hw_random/via-rng.c            |  5 +++--
> >  drivers/crypto/padlock-aes.c                |  2 +-
> >  drivers/crypto/padlock-sha.c                |  3 ++-
> >  fs/btrfs/disk-io.c                          |  2 +-
> >  19 files changed, 35 insertions(+), 54 deletions(-)
> 
> Ok, 0day says this patch makes tiny not so tiny:
> 
> i386-tinyconfig vmlinux size:
> 
> +-------+------+-------+-----+--------------------------------------------------------------------------------------+
> | TOTAL | TEXT | DATA  | BSS |                                                                                      |
> +-------+------+-------+-----+--------------------------------------------------------------------------------------+
> | +4646 |  +64 | +4096 |   0 | ab9976b5af96 x86/cpufeature: Remove unused and seldomly used cpu_has_xx macros       |
> |   -32 |  -32 |     0 |   0 | 13e835020a02 x86/cpufeature: Cleanup get_cpu_cap()                                   |
> |   +32 |  +32 |     0 |   0 | 3615f94f0486 x86/cpufeature: Move some of the scattered feature bits to x86_capabili |
> |  +136 |  +32 |     0 |   0 | 506d983184f4 Merge branch 'tip-fpu-xsave' into rc2+                                  |
> | +4782 |  +96 | +4096 |   0 | ALL COMMITS                                                                          |
> +-------+------+-------+-----+--------------------------------------------------------------------------------------+
> 
> Btw, thanks 0day!

Yay, it worked!  Thanks for paying attention to this.

> The problem comes from static_cpu_has_safe() adding the alternatives and
> fallback machinery. For example, before the patch, we had this at the
> cpu_has_* testing sites:
> 
>         movl    boot_cpu_data+20, %eax  # MEM[(const long unsigned int *)&boot_cpu_data + 20B], D.19113
>         testl   $2097152, %eax  #, D.19113
>         je      .L166   #,
> 
> and now we get this:
> 
> #APP
> # 449 "arch/x86/kernel/cpu/intel.c" 1
> # 0 "" 2
> # 511 "./arch/x86/include/asm/cpufeature.h" 1
>         1: jmp .L166    #
> 2:
> .skip -(((5f-4f) - (2b-1b)) > 0) * ((5f-4f) - (2b-1b)),0x90
> 3:
> .section .altinstructions,"a"
>  .long 1b - .
>  .long 4f - .
>  .word 117      #
>  .byte 3b - 1b
>  .byte 5f - 4f
>  .byte 3b - 2b
> .previous
> .section .altinstr_replacement,"ax"
> 4: jmp .L167    #
> 5:
> .previous
> .section .altinstructions,"a"
>  .long 1b - .
>  .long 0
>  .word 21       #
>  .byte 3b - 1b
>  .byte 0
>  .byte 0
> .previous
> 
> # 0 "" 2
> #NO_APP
>         jmp     .L168   #
> .L166:
>         movl    $21, %eax       #,
>         call    __static_cpu_has_safe   #
>         testb   %al, %al        # D.19126
>         je      .L167   #,
> .L168:
> #APP
> # 453 "arch/x86/kernel/cpu/intel.c" 1
> # 0 "" 2
> #NO_APP
> 
> That gets spread among .altinstructions, .altinstr_replacement, .text
> etc sections. .data grows too probably because of the NOP padding :-\

Yeah, padding makes the evaluation of some section sizes painful.

That said: .data?  I don't quite see how that happened.

> 		   text    data     bss     dec     hex filename
> before:		 644896  127436 1189384 1961716  1deef4 vmlinux
> after:		 645446  131532 1189384 1966362  1e011a vmlinux
> 
> 	[Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
> before:	[12] .altinstructions  PROGBITS        c10bdf48 0bef48 000680 00   A  0   0  1
> after:	[12] .altinstructions  PROGBITS        c10bff48 0c0f48 0007d2 00   A  0   0  1
> 
> before:	[13] .altinstr_replace PROGBITS        c10be5c8 0bf5c8 00016c 00  AX  0   0  1
> after:	[13] .altinstr_replace PROGBITS        c10c071a 0c171a 0001ad 00  AX  0   0  1
> 
> before:	[ 7] .data             PROGBITS        c1092000 093000 0132a0 00  WA  0   0 4096
> after:	[ 7] .data             PROGBITS        c1093000 094000 0142a0 00  WA  0   0 4096
> 
> So I'm wondering if we should make a config option which converts
> static_cpu_has* macros to boot_cpu_has()? That should slim down
> the kernel even more but it won't benefit from the speedup of the
> static_cpu_has* stuff.
> 
> Josh, thoughts?

Seems like a good idea to me: that would sacrifice a small amount of
runtime performance in favor of code size.  (Note that the config option
should use static_cpu_has when =y, and the slower, smaller method when
=n, so that "allnoconfig" can DTRT.)

Given that many embedded systems will know exactly what CPU they want to
run on, I'd also love to see a way to set the capabilities of the CPU at
compile time, so that all those checks (and the code within them) can
constant-fold away.

- Josh Triplett


* Re: [RFC PATCH 3/3] x86/cpufeature: Remove unused and seldomly used cpu_has_xx macros
  2015-11-24 22:42     ` Josh Triplett
@ 2015-11-25  0:10       ` Andy Lutomirski
  2015-11-25  2:58         ` Josh Triplett
  2015-11-27 13:52       ` Borislav Petkov
  1 sibling, 1 reply; 16+ messages in thread
From: Andy Lutomirski @ 2015-11-25  0:10 UTC (permalink / raw)
  To: Josh Triplett
  Cc: Borislav Petkov, X86 ML, LKML, Peter Zijlstra, Herbert Xu,
	Matt Mackall, Chris Mason, Josef Bacik, David Sterba,
	kbuild test robot

On Tue, Nov 24, 2015 at 2:42 PM, Josh Triplett <josh@joshtriplett.org> wrote:
>>                  text    data     bss     dec     hex filename
>> before:                644896  127436 1189384 1961716  1deef4 vmlinux
>> after:                 645446  131532 1189384 1966362  1e011a vmlinux
>>
>>       [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
>> before:       [12] .altinstructions  PROGBITS        c10bdf48 0bef48 000680 00   A  0   0  1
>> after:        [12] .altinstructions  PROGBITS        c10bff48 0c0f48 0007d2 00   A  0   0  1
>>
>> before:       [13] .altinstr_replace PROGBITS        c10be5c8 0bf5c8 00016c 00  AX  0   0  1
>> after:        [13] .altinstr_replace PROGBITS        c10c071a 0c171a 0001ad 00  AX  0   0  1
>>
>> before:       [ 7] .data             PROGBITS        c1092000 093000 0132a0 00  WA  0   0 4096
>> after:        [ 7] .data             PROGBITS        c1093000 094000 0142a0 00  WA  0   0 4096
>>
>> So I'm wondering if we should make a config option which converts
>> static_cpu_has* macros to boot_cpu_has()? That should slim down
>> the kernel even more but it won't benefit from the speedup of the
>> static_cpu_has* stuff.
>>
>> Josh, thoughts?
>
> Seems like a good idea to me: that would sacrifice a small amount of
> runtime performance in favor of code size.  (Note that the config option
> should use static_cpu_has when =y, and the slower, smaller method when
> =n, so that "allnoconfig" can DTRT.)
>
> Given that many embedded systems will know exactly what CPU they want to
> run on, I'd also love to see a way to set the capabilities of the CPU at
> compile time, so that all those checks (and the code within them) can
> constant-fold away.
>

As another idea, the alternatives infrastructure could plausibly be
rearranged so that it never exists in memory in decompressed form.  We
could decompress it streamily and process it as we go.

--Andy


* Re: [RFC PATCH 3/3] x86/cpufeature: Remove unused and seldomly used cpu_has_xx macros
  2015-11-25  0:10       ` Andy Lutomirski
@ 2015-11-25  2:58         ` Josh Triplett
  0 siblings, 0 replies; 16+ messages in thread
From: Josh Triplett @ 2015-11-25  2:58 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Borislav Petkov, X86 ML, LKML, Peter Zijlstra, Herbert Xu,
	Matt Mackall, Chris Mason, Josef Bacik, David Sterba,
	kbuild test robot

On November 24, 2015 4:10:48 PM PST, Andy Lutomirski <luto@amacapital.net> wrote:
>On Tue, Nov 24, 2015 at 2:42 PM, Josh Triplett <josh@joshtriplett.org>
>wrote:
>>>                  text    data     bss     dec     hex filename
>>> before:                644896  127436 1189384 1961716  1deef4 vmlinux
>>> after:                 645446  131532 1189384 1966362  1e011a vmlinux
>>>
>>>       [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
>>> before:       [12] .altinstructions  PROGBITS        c10bdf48 0bef48 000680 00   A  0   0  1
>>> after:        [12] .altinstructions  PROGBITS        c10bff48 0c0f48 0007d2 00   A  0   0  1
>>>
>>> before:       [13] .altinstr_replace PROGBITS        c10be5c8 0bf5c8 00016c 00  AX  0   0  1
>>> after:        [13] .altinstr_replace PROGBITS        c10c071a 0c171a 0001ad 00  AX  0   0  1
>>>
>>> before:       [ 7] .data             PROGBITS        c1092000 093000 0132a0 00  WA  0   0 4096
>>> after:        [ 7] .data             PROGBITS        c1093000 094000 0142a0 00  WA  0   0 4096
>>>
>>> So I'm wondering if we should make a config option which converts
>>> static_cpu_has* macros to boot_cpu_has()? That should slim down
>>> the kernel even more but it won't benefit from the speedup of the
>>> static_cpu_has* stuff.
>>>
>>> Josh, thoughts?
>>
>> Seems like a good idea to me: that would sacrifice a small amount of
>> runtime performance in favor of code size.  (Note that the config option
>> should use static_cpu_has when =y, and the slower, smaller method when
>> =n, so that "allnoconfig" can DTRT.)
>>
>> Given that many embedded systems will know exactly what CPU they want to
>> run on, I'd also love to see a way to set the capabilities of the CPU at
>> compile time, so that all those checks (and the code within them) can
>> constant-fold away.
>>
>
> As another idea, the alternatives infrastructure could plausibly be
> rearranged so that it never exists in memory in decompressed form.  We
> could decompress it streamily and process it as we go.

That doesn't help when running the uncompressed kernel in place, though. It'd be nice if every use of alternatives and similar mechanisms supported build-time resolution.




* Re: [RFC PATCH 3/3] x86/cpufeature: Remove unused and seldomly used cpu_has_xx macros
  2015-11-24 22:42     ` Josh Triplett
  2015-11-25  0:10       ` Andy Lutomirski
@ 2015-11-27 13:52       ` Borislav Petkov
  2015-11-27 18:04         ` Borislav Petkov
  1 sibling, 1 reply; 16+ messages in thread
From: Borislav Petkov @ 2015-11-27 13:52 UTC (permalink / raw)
  To: Josh Triplett
  Cc: X86 ML, LKML, Peter Zijlstra, Andy Lutomirski, Herbert Xu,
	Matt Mackall, Chris Mason, Josef Bacik, David Sterba,
	kbuild test robot

On Tue, Nov 24, 2015 at 02:42:11PM -0800, Josh Triplett wrote:
> > So I'm wondering if we should make a config option which converts
> > static_cpu_has* macros to boot_cpu_has()? That should slim down
> > the kernel even more but it won't benefit from the speedup of the
> > static_cpu_has* stuff.
> > 
> > Josh, thoughts?
> 
> Seems like a good idea to me: that would sacrifice a small amount of
> runtime performance in favor of code size.  (Note that the config option
> should use static_cpu_has when =y, and the slower, smaller method when
> =n, so that "allnoconfig" can DTRT.)

Yeah, so first things first.

Concerning the current issue, I went and converted the majority of the
macros to use boot_cpu_has() after all. Most of the paths are not hot
ones but init paths, so static_cpu_has_safe() doesn't make sense there.

Result is below, and the whole rework *actually* slims down tinyconfig
when the patches are applied on top of rc2 + tip/master:


commit							.TEXT 	.DATA 	.BSS
rc2+							650055 	127948 	1189128
0a53df8a1a3a ("x86/cpufeature: Move some of the...") 	649863 	127948 	1189384
ed03a85e6575 ("x86/cpufeature: Cleanup get_cpu_cap()") 	649831 	127948 	1189384
acde56aeda14 ("x86/cpufeature: Remove unused and...")	649831 	127948 	1189384

I'll look at doing the macro thing now, hopefully it doesn't get too ugly.

---
From: Borislav Petkov <bp@suse.de>
Date: Mon, 9 Nov 2015 10:38:45 +0100
Subject: [PATCH] x86/cpufeature: Remove unused and seldomly used cpu_has_xx
 macros

Those are stupid and code should use static_cpu_has_safe() or
boot_cpu_has() instead. Kill the least used and unused ones.

The remaining ones need more careful inspection before a conversion can
happen. On the TODO.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: David Sterba <dsterba@suse.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Chris Mason <clm@fb.com>
Cc: Josef Bacik <jbacik@fb.com>
---
 arch/x86/crypto/chacha20_glue.c             |  2 +-
 arch/x86/crypto/crc32c-intel_glue.c         |  2 +-
 arch/x86/include/asm/cmpxchg_32.h           |  2 +-
 arch/x86/include/asm/cmpxchg_64.h           |  2 +-
 arch/x86/include/asm/cpufeature.h           | 37 ++++-------------------------
 arch/x86/include/asm/xor_32.h               |  2 +-
 arch/x86/kernel/cpu/amd.c                   |  4 ++--
 arch/x86/kernel/cpu/common.c                |  4 +++-
 arch/x86/kernel/cpu/intel.c                 |  3 ++-
 arch/x86/kernel/cpu/intel_cacheinfo.c       |  6 ++---
 arch/x86/kernel/cpu/mtrr/generic.c          |  2 +-
 arch/x86/kernel/cpu/mtrr/main.c             |  2 +-
 arch/x86/kernel/cpu/perf_event_amd.c        |  4 ++--
 arch/x86/kernel/cpu/perf_event_amd_uncore.c | 11 +++++----
 arch/x86/kernel/fpu/init.c                  |  4 ++--
 arch/x86/kernel/hw_breakpoint.c             |  6 +++--
 arch/x86/kernel/smpboot.c                   |  2 +-
 arch/x86/kernel/vm86_32.c                   |  4 +++-
 arch/x86/mm/setup_nx.c                      |  4 ++--
 drivers/char/hw_random/via-rng.c            |  5 ++--
 drivers/crypto/padlock-aes.c                |  2 +-
 drivers/crypto/padlock-sha.c                |  2 +-
 drivers/iommu/intel_irq_remapping.c         |  2 +-
 fs/btrfs/disk-io.c                          |  2 +-
 24 files changed, 48 insertions(+), 68 deletions(-)

diff --git a/arch/x86/crypto/chacha20_glue.c b/arch/x86/crypto/chacha20_glue.c
index 722bacea040e..8baaff5af0b5 100644
--- a/arch/x86/crypto/chacha20_glue.c
+++ b/arch/x86/crypto/chacha20_glue.c
@@ -125,7 +125,7 @@ static struct crypto_alg alg = {
 
 static int __init chacha20_simd_mod_init(void)
 {
-	if (!cpu_has_ssse3)
+	if (!boot_cpu_has(X86_FEATURE_SSSE3))
 		return -ENODEV;
 
 #ifdef CONFIG_AS_AVX2
diff --git a/arch/x86/crypto/crc32c-intel_glue.c b/arch/x86/crypto/crc32c-intel_glue.c
index 81a595d75cf5..0e9871693f24 100644
--- a/arch/x86/crypto/crc32c-intel_glue.c
+++ b/arch/x86/crypto/crc32c-intel_glue.c
@@ -257,7 +257,7 @@ static int __init crc32c_intel_mod_init(void)
 	if (!x86_match_cpu(crc32c_cpu_id))
 		return -ENODEV;
 #ifdef CONFIG_X86_64
-	if (cpu_has_pclmulqdq) {
+	if (boot_cpu_has(X86_FEATURE_PCLMULQDQ)) {
 		alg.update = crc32c_pcl_intel_update;
 		alg.finup = crc32c_pcl_intel_finup;
 		alg.digest = crc32c_pcl_intel_digest;
diff --git a/arch/x86/include/asm/cmpxchg_32.h b/arch/x86/include/asm/cmpxchg_32.h
index f7e142926481..e4959d023af8 100644
--- a/arch/x86/include/asm/cmpxchg_32.h
+++ b/arch/x86/include/asm/cmpxchg_32.h
@@ -109,6 +109,6 @@ static inline u64 __cmpxchg64_local(volatile u64 *ptr, u64 old, u64 new)
 
 #endif
 
-#define system_has_cmpxchg_double() cpu_has_cx8
+#define system_has_cmpxchg_double() boot_cpu_has(X86_FEATURE_CX8)
 
 #endif /* _ASM_X86_CMPXCHG_32_H */
diff --git a/arch/x86/include/asm/cmpxchg_64.h b/arch/x86/include/asm/cmpxchg_64.h
index 1af94697aae5..caa23a34c963 100644
--- a/arch/x86/include/asm/cmpxchg_64.h
+++ b/arch/x86/include/asm/cmpxchg_64.h
@@ -18,6 +18,6 @@ static inline void set_64bit(volatile u64 *ptr, u64 val)
 	cmpxchg_local((ptr), (o), (n));					\
 })
 
-#define system_has_cmpxchg_double() cpu_has_cx16
+#define system_has_cmpxchg_double() boot_cpu_has(X86_FEATURE_CX16)
 
 #endif /* _ASM_X86_CMPXCHG_64_H */
diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index 604f63695d7d..cbe390044a7c 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -386,58 +386,29 @@ extern const char * const x86_bug_flags[NBUGINTS*32];
 } while (0)
 
 #define cpu_has_fpu		boot_cpu_has(X86_FEATURE_FPU)
-#define cpu_has_de		boot_cpu_has(X86_FEATURE_DE)
 #define cpu_has_pse		boot_cpu_has(X86_FEATURE_PSE)
 #define cpu_has_tsc		boot_cpu_has(X86_FEATURE_TSC)
 #define cpu_has_pge		boot_cpu_has(X86_FEATURE_PGE)
 #define cpu_has_apic		boot_cpu_has(X86_FEATURE_APIC)
-#define cpu_has_sep		boot_cpu_has(X86_FEATURE_SEP)
-#define cpu_has_mtrr		boot_cpu_has(X86_FEATURE_MTRR)
-#define cpu_has_mmx		boot_cpu_has(X86_FEATURE_MMX)
 #define cpu_has_fxsr		boot_cpu_has(X86_FEATURE_FXSR)
 #define cpu_has_xmm		boot_cpu_has(X86_FEATURE_XMM)
 #define cpu_has_xmm2		boot_cpu_has(X86_FEATURE_XMM2)
-#define cpu_has_xmm3		boot_cpu_has(X86_FEATURE_XMM3)
-#define cpu_has_ssse3		boot_cpu_has(X86_FEATURE_SSSE3)
 #define cpu_has_aes		boot_cpu_has(X86_FEATURE_AES)
 #define cpu_has_avx		boot_cpu_has(X86_FEATURE_AVX)
 #define cpu_has_avx2		boot_cpu_has(X86_FEATURE_AVX2)
-#define cpu_has_ht		boot_cpu_has(X86_FEATURE_HT)
-#define cpu_has_nx		boot_cpu_has(X86_FEATURE_NX)
-#define cpu_has_xstore		boot_cpu_has(X86_FEATURE_XSTORE)
-#define cpu_has_xstore_enabled	boot_cpu_has(X86_FEATURE_XSTORE_EN)
-#define cpu_has_xcrypt		boot_cpu_has(X86_FEATURE_XCRYPT)
-#define cpu_has_xcrypt_enabled	boot_cpu_has(X86_FEATURE_XCRYPT_EN)
-#define cpu_has_ace2		boot_cpu_has(X86_FEATURE_ACE2)
-#define cpu_has_ace2_enabled	boot_cpu_has(X86_FEATURE_ACE2_EN)
-#define cpu_has_phe		boot_cpu_has(X86_FEATURE_PHE)
-#define cpu_has_phe_enabled	boot_cpu_has(X86_FEATURE_PHE_EN)
-#define cpu_has_pmm		boot_cpu_has(X86_FEATURE_PMM)
-#define cpu_has_pmm_enabled	boot_cpu_has(X86_FEATURE_PMM_EN)
-#define cpu_has_ds		boot_cpu_has(X86_FEATURE_DS)
-#define cpu_has_pebs		boot_cpu_has(X86_FEATURE_PEBS)
 #define cpu_has_clflush		boot_cpu_has(X86_FEATURE_CLFLUSH)
-#define cpu_has_bts		boot_cpu_has(X86_FEATURE_BTS)
 #define cpu_has_gbpages		boot_cpu_has(X86_FEATURE_GBPAGES)
 #define cpu_has_arch_perfmon	boot_cpu_has(X86_FEATURE_ARCH_PERFMON)
 #define cpu_has_pat		boot_cpu_has(X86_FEATURE_PAT)
-#define cpu_has_xmm4_1		boot_cpu_has(X86_FEATURE_XMM4_1)
-#define cpu_has_xmm4_2		boot_cpu_has(X86_FEATURE_XMM4_2)
 #define cpu_has_x2apic		boot_cpu_has(X86_FEATURE_X2APIC)
 #define cpu_has_xsave		boot_cpu_has(X86_FEATURE_XSAVE)
-#define cpu_has_xsaveopt	boot_cpu_has(X86_FEATURE_XSAVEOPT)
 #define cpu_has_xsaves		boot_cpu_has(X86_FEATURE_XSAVES)
 #define cpu_has_osxsave		boot_cpu_has(X86_FEATURE_OSXSAVE)
 #define cpu_has_hypervisor	boot_cpu_has(X86_FEATURE_HYPERVISOR)
-#define cpu_has_pclmulqdq	boot_cpu_has(X86_FEATURE_PCLMULQDQ)
-#define cpu_has_perfctr_core	boot_cpu_has(X86_FEATURE_PERFCTR_CORE)
-#define cpu_has_perfctr_nb	boot_cpu_has(X86_FEATURE_PERFCTR_NB)
-#define cpu_has_perfctr_l2	boot_cpu_has(X86_FEATURE_PERFCTR_L2)
-#define cpu_has_cx8		boot_cpu_has(X86_FEATURE_CX8)
-#define cpu_has_cx16		boot_cpu_has(X86_FEATURE_CX16)
-#define cpu_has_eager_fpu	boot_cpu_has(X86_FEATURE_EAGER_FPU)
-#define cpu_has_topoext		boot_cpu_has(X86_FEATURE_TOPOEXT)
-#define cpu_has_bpext		boot_cpu_has(X86_FEATURE_BPEXT)
+/*
+ * Do not add any more of those clumsy macros - use static_cpu_has_safe() for
+ * fast paths and boot_cpu_has() otherwise!
+ */
 
 #if __GNUC__ >= 4
 extern void warn_pre_alternatives(void);
diff --git a/arch/x86/include/asm/xor_32.h b/arch/x86/include/asm/xor_32.h
index 5a08bc8bff33..ccca77dad474 100644
--- a/arch/x86/include/asm/xor_32.h
+++ b/arch/x86/include/asm/xor_32.h
@@ -553,7 +553,7 @@ do {							\
 	if (cpu_has_xmm) {				\
 		xor_speed(&xor_block_pIII_sse);		\
 		xor_speed(&xor_block_sse_pf64);		\
-	} else if (cpu_has_mmx) {			\
+	} else if (static_cpu_has_safe(X86_FEATURE_MMX)) { \
 		xor_speed(&xor_block_pII_mmx);		\
 		xor_speed(&xor_block_p5_mmx);		\
 	} else {					\
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index e229640c19ab..e678ddeed030 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -304,7 +304,7 @@ static void amd_get_topology(struct cpuinfo_x86 *c)
 	int cpu = smp_processor_id();
 
 	/* get information required for multi-node processors */
-	if (cpu_has_topoext) {
+	if (boot_cpu_has(X86_FEATURE_TOPOEXT)) {
 		u32 eax, ebx, ecx, edx;
 
 		cpuid(0x8000001e, &eax, &ebx, &ecx, &edx);
@@ -922,7 +922,7 @@ static bool cpu_has_amd_erratum(struct cpuinfo_x86 *cpu, const int *erratum)
 
 void set_dr_addr_mask(unsigned long mask, int dr)
 {
-	if (!cpu_has_bpext)
+	if (!boot_cpu_has(X86_FEATURE_BPEXT))
 		return;
 
 	switch (dr) {
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index e72fa2dab911..37830de8f60a 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1440,7 +1440,9 @@ void cpu_init(void)
 
 	printk(KERN_INFO "Initializing CPU#%d\n", cpu);
 
-	if (cpu_feature_enabled(X86_FEATURE_VME) || cpu_has_tsc || cpu_has_de)
+	if (cpu_feature_enabled(X86_FEATURE_VME) ||
+	    cpu_has_tsc ||
+	    boot_cpu_has(X86_FEATURE_DE))
 		cr4_clear_bits(X86_CR4_VME|X86_CR4_PVI|X86_CR4_TSD|X86_CR4_DE);
 
 	load_current_idt();
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index 209ac1e7d1f0..565648bc1a0a 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -445,7 +445,8 @@ static void init_intel(struct cpuinfo_x86 *c)
 
 	if (cpu_has_xmm2)
 		set_cpu_cap(c, X86_FEATURE_LFENCE_RDTSC);
-	if (cpu_has_ds) {
+
+	if (boot_cpu_has(X86_FEATURE_DS)) {
 		unsigned int l1;
 		rdmsr(MSR_IA32_MISC_ENABLE, l1, l2);
 		if (!(l1 & (1<<11)))
diff --git a/arch/x86/kernel/cpu/intel_cacheinfo.c b/arch/x86/kernel/cpu/intel_cacheinfo.c
index e38d338a6447..0b6c52388cf4 100644
--- a/arch/x86/kernel/cpu/intel_cacheinfo.c
+++ b/arch/x86/kernel/cpu/intel_cacheinfo.c
@@ -591,7 +591,7 @@ cpuid4_cache_lookup_regs(int index, struct _cpuid4_info_regs *this_leaf)
 	unsigned		edx;
 
 	if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD) {
-		if (cpu_has_topoext)
+		if (boot_cpu_has(X86_FEATURE_TOPOEXT))
 			cpuid_count(0x8000001d, index, &eax.full,
 				    &ebx.full, &ecx.full, &edx);
 		else
@@ -637,7 +637,7 @@ static int find_num_cache_leaves(struct cpuinfo_x86 *c)
 void init_amd_cacheinfo(struct cpuinfo_x86 *c)
 {
 
-	if (cpu_has_topoext) {
+	if (boot_cpu_has(X86_FEATURE_TOPOEXT)) {
 		num_cache_leaves = find_num_cache_leaves(c);
 	} else if (c->extended_cpuid_level >= 0x80000006) {
 		if (cpuid_edx(0x80000006) & 0xf000)
@@ -809,7 +809,7 @@ static int __cache_amd_cpumap_setup(unsigned int cpu, int index,
 	struct cacheinfo *this_leaf;
 	int i, sibling;
 
-	if (cpu_has_topoext) {
+	if (boot_cpu_has(X86_FEATURE_TOPOEXT)) {
 		unsigned int apicid, nshared, first, last;
 
 		this_leaf = this_cpu_ci->info_list + index;
diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c
index 3b533cf37c74..8f2ef910c7bf 100644
--- a/arch/x86/kernel/cpu/mtrr/generic.c
+++ b/arch/x86/kernel/cpu/mtrr/generic.c
@@ -349,7 +349,7 @@ static void get_fixed_ranges(mtrr_type *frs)
 
 void mtrr_save_fixed_ranges(void *info)
 {
-	if (cpu_has_mtrr)
+	if (static_cpu_has_safe(X86_FEATURE_MTRR))
 		get_fixed_ranges(mtrr_state.fixed_ranges);
 }
 
diff --git a/arch/x86/kernel/cpu/mtrr/main.c b/arch/x86/kernel/cpu/mtrr/main.c
index f891b4750f04..5c3d149ee91c 100644
--- a/arch/x86/kernel/cpu/mtrr/main.c
+++ b/arch/x86/kernel/cpu/mtrr/main.c
@@ -682,7 +682,7 @@ void __init mtrr_bp_init(void)
 
 	phys_addr = 32;
 
-	if (cpu_has_mtrr) {
+	if (boot_cpu_has(X86_FEATURE_MTRR)) {
 		mtrr_if = &generic_mtrr_ops;
 		size_or_mask = SIZE_OR_MASK_BITS(36);
 		size_and_mask = 0x00f00000;
diff --git a/arch/x86/kernel/cpu/perf_event_amd.c b/arch/x86/kernel/cpu/perf_event_amd.c
index 1cee5d2d7ece..3ea177cb7366 100644
--- a/arch/x86/kernel/cpu/perf_event_amd.c
+++ b/arch/x86/kernel/cpu/perf_event_amd.c
@@ -160,7 +160,7 @@ static inline int amd_pmu_addr_offset(int index, bool eventsel)
 	if (offset)
 		return offset;
 
-	if (!cpu_has_perfctr_core)
+	if (!boot_cpu_has(X86_FEATURE_PERFCTR_CORE))
 		offset = index;
 	else
 		offset = index << 1;
@@ -652,7 +652,7 @@ static __initconst const struct x86_pmu amd_pmu = {
 
 static int __init amd_core_pmu_init(void)
 {
-	if (!cpu_has_perfctr_core)
+	if (!boot_cpu_has(X86_FEATURE_PERFCTR_CORE))
 		return 0;
 
 	switch (boot_cpu_data.x86) {
diff --git a/arch/x86/kernel/cpu/perf_event_amd_uncore.c b/arch/x86/kernel/cpu/perf_event_amd_uncore.c
index cc6cedb8f25d..49742746a6c9 100644
--- a/arch/x86/kernel/cpu/perf_event_amd_uncore.c
+++ b/arch/x86/kernel/cpu/perf_event_amd_uncore.c
@@ -523,10 +523,10 @@ static int __init amd_uncore_init(void)
 	if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD)
 		goto fail_nodev;
 
-	if (!cpu_has_topoext)
+	if (!boot_cpu_has(X86_FEATURE_TOPOEXT))
 		goto fail_nodev;
 
-	if (cpu_has_perfctr_nb) {
+	if (boot_cpu_has(X86_FEATURE_PERFCTR_NB)) {
 		amd_uncore_nb = alloc_percpu(struct amd_uncore *);
 		if (!amd_uncore_nb) {
 			ret = -ENOMEM;
@@ -540,7 +540,7 @@ static int __init amd_uncore_init(void)
 		ret = 0;
 	}
 
-	if (cpu_has_perfctr_l2) {
+	if (boot_cpu_has(X86_FEATURE_PERFCTR_L2)) {
 		amd_uncore_l2 = alloc_percpu(struct amd_uncore *);
 		if (!amd_uncore_l2) {
 			ret = -ENOMEM;
@@ -583,10 +583,11 @@ fail_online:
 
 	/* amd_uncore_nb/l2 should have been freed by cleanup_cpu_online */
 	amd_uncore_nb = amd_uncore_l2 = NULL;
-	if (cpu_has_perfctr_l2)
+
+	if (boot_cpu_has(X86_FEATURE_PERFCTR_L2))
 		perf_pmu_unregister(&amd_l2_pmu);
 fail_l2:
-	if (cpu_has_perfctr_nb)
+	if (boot_cpu_has(X86_FEATURE_PERFCTR_NB))
 		perf_pmu_unregister(&amd_nb_pmu);
 	if (amd_uncore_l2)
 		free_percpu(amd_uncore_l2);
diff --git a/arch/x86/kernel/fpu/init.c b/arch/x86/kernel/fpu/init.c
index be39b5fde4b9..22abea04731e 100644
--- a/arch/x86/kernel/fpu/init.c
+++ b/arch/x86/kernel/fpu/init.c
@@ -12,7 +12,7 @@
  */
 static void fpu__init_cpu_ctx_switch(void)
 {
-	if (!cpu_has_eager_fpu)
+	if (!boot_cpu_has(X86_FEATURE_EAGER_FPU))
 		stts();
 	else
 		clts();
@@ -287,7 +287,7 @@ static void __init fpu__init_system_ctx_switch(void)
 	current_thread_info()->status = 0;
 
 	/* Auto enable eagerfpu for xsaveopt */
-	if (cpu_has_xsaveopt && eagerfpu != DISABLE)
+	if (boot_cpu_has(X86_FEATURE_XSAVEOPT) && eagerfpu != DISABLE)
 		eagerfpu = ENABLE;
 
 	if (xfeatures_mask & XFEATURE_MASK_EAGER) {
diff --git a/arch/x86/kernel/hw_breakpoint.c b/arch/x86/kernel/hw_breakpoint.c
index 50a3fad5b89f..2bcfb5f2bc44 100644
--- a/arch/x86/kernel/hw_breakpoint.c
+++ b/arch/x86/kernel/hw_breakpoint.c
@@ -300,6 +300,10 @@ static int arch_build_bp_info(struct perf_event *bp)
 			return -EINVAL;
 		if (bp->attr.bp_addr & (bp->attr.bp_len - 1))
 			return -EINVAL;
+
+		if (!boot_cpu_has(X86_FEATURE_BPEXT))
+			return -EOPNOTSUPP;
+
 		/*
 		 * It's impossible to use a range breakpoint to fake out
 		 * user vs kernel detection because bp_len - 1 can't
@@ -307,8 +311,6 @@ static int arch_build_bp_info(struct perf_event *bp)
 		 * breakpoints, then we'll have to check for kprobe-blacklisted
 		 * addresses anywhere in the range.
 		 */
-		if (!cpu_has_bpext)
-			return -EOPNOTSUPP;
 		info->mask = bp->attr.bp_len - 1;
 		info->len = X86_BREAKPOINT_LEN_1;
 	}
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index f2281e9cfdbe..24d57f77b3c1 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -304,7 +304,7 @@ do {									\
 
 static bool match_smt(struct cpuinfo_x86 *c, struct cpuinfo_x86 *o)
 {
-	if (cpu_has_topoext) {
+	if (boot_cpu_has(X86_FEATURE_TOPOEXT)) {
 		int cpu1 = c->cpu_index, cpu2 = o->cpu_index;
 
 		if (c->phys_proc_id == o->phys_proc_id &&
diff --git a/arch/x86/kernel/vm86_32.c b/arch/x86/kernel/vm86_32.c
index 524619351961..483231ebbb0b 100644
--- a/arch/x86/kernel/vm86_32.c
+++ b/arch/x86/kernel/vm86_32.c
@@ -357,8 +357,10 @@ static long do_sys_vm86(struct vm86plus_struct __user *user_vm86, bool plus)
 	tss = &per_cpu(cpu_tss, get_cpu());
 	/* make room for real-mode segments */
 	tsk->thread.sp0 += 16;
-	if (cpu_has_sep)
+
+	if (static_cpu_has_safe(X86_FEATURE_SEP))
 		tsk->thread.sysenter_cs = 0;
+
 	load_sp0(tss, &tsk->thread);
 	put_cpu();
 
diff --git a/arch/x86/mm/setup_nx.c b/arch/x86/mm/setup_nx.c
index 90555bf60aa4..92e2eacb3321 100644
--- a/arch/x86/mm/setup_nx.c
+++ b/arch/x86/mm/setup_nx.c
@@ -31,7 +31,7 @@ early_param("noexec", noexec_setup);
 
 void x86_configure_nx(void)
 {
-	if (cpu_has_nx && !disable_nx)
+	if (boot_cpu_has(X86_FEATURE_NX) && !disable_nx)
 		__supported_pte_mask |= _PAGE_NX;
 	else
 		__supported_pte_mask &= ~_PAGE_NX;
@@ -39,7 +39,7 @@ void x86_configure_nx(void)
 
 void __init x86_report_nx(void)
 {
-	if (!cpu_has_nx) {
+	if (!boot_cpu_has(X86_FEATURE_NX)) {
 		printk(KERN_NOTICE "Notice: NX (Execute Disable) protection "
 		       "missing in CPU!\n");
 	} else {
diff --git a/drivers/char/hw_random/via-rng.c b/drivers/char/hw_random/via-rng.c
index 0c98a9d51a24..44ce80606944 100644
--- a/drivers/char/hw_random/via-rng.c
+++ b/drivers/char/hw_random/via-rng.c
@@ -140,7 +140,7 @@ static int via_rng_init(struct hwrng *rng)
 	 * RNG configuration like it used to be the case in this
 	 * register */
 	if ((c->x86 == 6) && (c->x86_model >= 0x0f)) {
-		if (!cpu_has_xstore_enabled) {
+		if (!boot_cpu_has(X86_FEATURE_XSTORE_EN)) {
 			pr_err(PFX "can't enable hardware RNG "
 				"if XSTORE is not enabled\n");
 			return -ENODEV;
@@ -200,8 +200,9 @@ static int __init mod_init(void)
 {
 	int err;
 
-	if (!cpu_has_xstore)
+	if (!boot_cpu_has(X86_FEATURE_XSTORE))
 		return -ENODEV;
+
 	pr_info("VIA RNG detected\n");
 	err = hwrng_register(&via_rng);
 	if (err) {
diff --git a/drivers/crypto/padlock-aes.c b/drivers/crypto/padlock-aes.c
index da2d6777bd09..97a364694bfc 100644
--- a/drivers/crypto/padlock-aes.c
+++ b/drivers/crypto/padlock-aes.c
@@ -515,7 +515,7 @@ static int __init padlock_init(void)
 	if (!x86_match_cpu(padlock_cpu_id))
 		return -ENODEV;
 
-	if (!cpu_has_xcrypt_enabled) {
+	if (!boot_cpu_has(X86_FEATURE_XCRYPT_EN)) {
 		printk(KERN_NOTICE PFX "VIA PadLock detected, but not enabled. Hmm, strange...\n");
 		return -ENODEV;
 	}
diff --git a/drivers/crypto/padlock-sha.c b/drivers/crypto/padlock-sha.c
index 4e154c9b9206..8c5f90647b7a 100644
--- a/drivers/crypto/padlock-sha.c
+++ b/drivers/crypto/padlock-sha.c
@@ -540,7 +540,7 @@ static int __init padlock_init(void)
 	struct shash_alg *sha1;
 	struct shash_alg *sha256;
 
-	if (!x86_match_cpu(padlock_sha_ids) || !cpu_has_phe_enabled)
+	if (!x86_match_cpu(padlock_sha_ids) || !boot_cpu_has(X86_FEATURE_PHE_EN))
 		return -ENODEV;
 
 	/* Register the newly added algorithm module if on *
diff --git a/drivers/iommu/intel_irq_remapping.c b/drivers/iommu/intel_irq_remapping.c
index 1fae1881648c..c12ba4516df2 100644
--- a/drivers/iommu/intel_irq_remapping.c
+++ b/drivers/iommu/intel_irq_remapping.c
@@ -753,7 +753,7 @@ static inline void set_irq_posting_cap(void)
 		 * should have X86_FEATURE_CX16 support, this has been confirmed
 		 * with Intel hardware guys.
 		 */
-		if ( cpu_has_cx16 )
+		if (boot_cpu_has(X86_FEATURE_CX16))
 			intel_irq_remap_ops.capability |= 1 << IRQ_POSTING_CAP;
 
 		for_each_iommu(iommu, drhd)
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 974be09e7556..42a378a4eefb 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -923,7 +923,7 @@ static int check_async_write(struct inode *inode, unsigned long bio_flags)
 	if (bio_flags & EXTENT_BIO_TREE_LOG)
 		return 0;
 #ifdef CONFIG_X86
-	if (cpu_has_xmm4_2)
+	if (static_cpu_has_safe(X86_FEATURE_XMM4_2))
 		return 0;
 #endif
 	return 1;
-- 
2.3.5

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.


* Re: [RFC PATCH 3/3] x86/cpufeature: Remove unused and seldomly used cpu_has_xx macros
  2015-11-27 13:52       ` Borislav Petkov
@ 2015-11-27 18:04         ` Borislav Petkov
  2015-11-27 20:13           ` Josh Triplett
  0 siblings, 1 reply; 16+ messages in thread
From: Borislav Petkov @ 2015-11-27 18:04 UTC (permalink / raw)
  To: Josh Triplett
  Cc: X86 ML, LKML, Peter Zijlstra, Andy Lutomirski, Herbert Xu,
	Matt Mackall, Chris Mason, Josef Bacik, David Sterba,
	kbuild test robot

On Fri, Nov 27, 2015 at 02:52:57PM +0100, Borislav Petkov wrote:
> commit						.TEXT 	.DATA 	.BSS
> rc2+							650055 	127948 	1189128
> 0a53df8a1a3a ("x86/cpufeature: Move some of the...") 	649863 	127948 	1189384
> ed03a85e6575 ("x86/cpufeature: Cleanup get_cpu_cap()") 	649831 	127948 	1189384
> acde56aeda14 ("x86/cpufeature: Remove unused and...")	649831 	127948 	1189384
> 
> I'll look at doing the macro thing now, hopefully it doesn't get too ugly.

Yeah, we do save us some ~1.6K text (cf numbers above) for the price
of a bit slower feature bit testing. Don't know if it matters at all,
though:

commit							.TEXT 	.DATA 	.BSS
CONFIG_X86_FAST_FEATURE_TESTS				648209	127948	1189384

and diff looks pretty simple:

---
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 4a9b9a9a1a64..ff64585ea0bf 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -350,6 +350,10 @@ config X86_FEATURE_NAMES
 
 	  If in doubt, say Y.
 
+config X86_FAST_FEATURE_TESTS
+	bool "Fast feature tests" if EMBEDDED
+	default y
+
 config X86_X2APIC
 	bool "Support x2apic"
 	depends on X86_LOCAL_APIC && X86_64 && (IRQ_REMAP || HYPERVISOR_GUEST)
diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index cbe390044a7c..7ad8c9464297 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -410,7 +410,7 @@ extern const char * const x86_bug_flags[NBUGINTS*32];
  * fast paths and boot_cpu_has() otherwise!
  */
 
-#if __GNUC__ >= 4
+#if __GNUC__ >= 4 && defined(CONFIG_X86_FAST_FEATURE_TESTS)
 extern void warn_pre_alternatives(void);
 extern bool __static_cpu_has_safe(u16 bit);



* Re: [RFC PATCH 3/3] x86/cpufeature: Remove unused and seldomly used cpu_has_xx macros
  2015-11-27 18:04         ` Borislav Petkov
@ 2015-11-27 20:13           ` Josh Triplett
  2015-11-27 20:23             ` Borislav Petkov
  0 siblings, 1 reply; 16+ messages in thread
From: Josh Triplett @ 2015-11-27 20:13 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: X86 ML, LKML, Peter Zijlstra, Andy Lutomirski, Herbert Xu,
	Matt Mackall, Chris Mason, Josef Bacik, David Sterba,
	kbuild test robot

On Fri, Nov 27, 2015 at 07:04:33PM +0100, Borislav Petkov wrote:
> On Fri, Nov 27, 2015 at 02:52:57PM +0100, Borislav Petkov wrote:
> > commit						.TEXT 	.DATA 	.BSS
> > rc2+							650055 	127948 	1189128
> > 0a53df8a1a3a ("x86/cpufeature: Move some of the...") 	649863 	127948 	1189384
> > ed03a85e6575 ("x86/cpufeature: Cleanup get_cpu_cap()") 	649831 	127948 	1189384
> > acde56aeda14 ("x86/cpufeature: Remove unused and...")	649831 	127948 	1189384
> > 
> > I'll look at doing the macro thing now, hopefully it doesn't get too ugly.
> 
> Yeah, we do save us some ~1.6K text (cf numbers above) for the price
> of a bit slower feature bit testing. Don't know if it matters at all,
> though:
> 
> commit							.TEXT 	.DATA 	.BSS
> CONFIG_X86_FAST_FEATURE_TESTS				648209	127948	1189384
> 
> and diff looks pretty simple:

Given an appropriate long description for that config option, that seems
worthwhile.  Something like this:

Some fast-paths in the kernel depend on the capabilities of the CPU.
Say Y here for the kernel to patch in the appropriate code at runtime
based on the capabilities of the CPU.  The infrastructure for patching
code at runtime takes up some additional space; space-constrained
embedded systems may wish to say N here to produce smaller, slightly
slower code.
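Folding that help text into the option from the diff below, the resulting Kconfig entry would presumably look something like this (a sketch only; the exact wording and prompt that landed upstream may differ):

```
config X86_FAST_FEATURE_TESTS
	bool "Fast CPU feature tests" if EMBEDDED
	default y
	---help---
	  Some fast-paths in the kernel depend on the capabilities of the
	  CPU. Say Y here for the kernel to patch in the appropriate code
	  at runtime based on the capabilities of the CPU. The
	  infrastructure for patching code at runtime takes up some
	  additional space; space-constrained embedded systems may wish
	  to say N here to produce smaller, slightly slower code.
```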

> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index 4a9b9a9a1a64..ff64585ea0bf 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -350,6 +350,10 @@ config X86_FEATURE_NAMES
>  
>  	  If in doubt, say Y.
>  
> +config X86_FAST_FEATURE_TESTS
> +	bool "Fast feature tests" if EMBEDDED
> +	default y
> +
>  config X86_X2APIC
>  	bool "Support x2apic"
>  	depends on X86_LOCAL_APIC && X86_64 && (IRQ_REMAP || HYPERVISOR_GUEST)
> diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
> index cbe390044a7c..7ad8c9464297 100644
> --- a/arch/x86/include/asm/cpufeature.h
> +++ b/arch/x86/include/asm/cpufeature.h
> @@ -410,7 +410,7 @@ extern const char * const x86_bug_flags[NBUGINTS*32];
>   * fast paths and boot_cpu_has() otherwise!
>   */
>  
> -#if __GNUC__ >= 4
> +#if __GNUC__ >= 4 && defined(CONFIG_X86_FAST_FEATURE_TESTS)
>  extern void warn_pre_alternatives(void);
>  extern bool __static_cpu_has_safe(u16 bit);
> 


* Re: [RFC PATCH 3/3] x86/cpufeature: Remove unused and seldomly used cpu_has_xx macros
  2015-11-27 20:13           ` Josh Triplett
@ 2015-11-27 20:23             ` Borislav Petkov
  0 siblings, 0 replies; 16+ messages in thread
From: Borislav Petkov @ 2015-11-27 20:23 UTC (permalink / raw)
  To: Josh Triplett
  Cc: X86 ML, LKML, Peter Zijlstra, Andy Lutomirski, Herbert Xu,
	Matt Mackall, Chris Mason, Josef Bacik, David Sterba,
	kbuild test robot

On Fri, Nov 27, 2015 at 12:13:55PM -0800, Josh Triplett wrote:
> Given an appropriate long description for that config option, that seems
> worthwhile.  Something like this:
> 
> Some fast-paths in the kernel depend on the capabilities of the CPU.
> Say Y here for the kernel to patch in the appropriate code at runtime
> based on the capabilities of the CPU.  The infrastructure for patching
> code at runtime takes up some additional space; space-constrained
> embedded systems may wish to say N here to produce smaller, slightly
> slower code.

Thanks for the text, looks good and I'll use it. :)

And yes, considering the size of the patch, it is really worthwhile to
save ~1.6K kernel text that easily.

I'll do a proper patch and run it through the build tests.

Thanks.



