linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 0/3] x86/delay: Introduce TPAUSE instruction
@ 2020-04-10 23:29 Kyung Min Park
  2020-04-10 23:29 ` [PATCH v3 1/3] x86/delay: Preparatory code cleanup Kyung Min Park
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Kyung Min Park @ 2020-04-10 23:29 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: tglx, mingo, hpa, gregkh, ak, tony.luck, ashok.raj,
	ravi.v.shankar, fenghua.yu, kyung.min.park

Intel processors that support the WAITPKG feature implement
the TPAUSE instruction that suspends execution in a lower power
state until the TSC (Time Stamp Counter) exceeds a certain value.

Update the udelay() function to use TPAUSE on systems where it
is available. Note that we hard code the deeper (C0.2) sleep
state because exit latency is small compared to the "microseconds"
that usleep() will delay.

ChangeLog:
- Change from v2 to v3:
  1  Add Thomas' cleanup patch to this patchset.
  2. Implement use_tpause_delay() to use TPAUSE.
  3. Call use_tpause_delay() during x86_late_time_init().
  4. Use APIs lower_32_bits(), upper_32_bits() as suggested by Joe Perch.
  5. Change __tpause() argument integer type from unsigned int to u32.

- Change from v1 to v2:
  1. Change function/variable names as suggested by Thomas i.e.
     a. Change to delay_halt_fn/delay_halt_mwaitx/delay_halt_tpause from
        wait_func/mwaitx/tpause.
     b. Change variable name loops to cycles.
     c. Change back to the original name delay_fn from delay_platform.
  2. Organize comments to use full width.
  3. Add __ro_after_init for the function pointer delay_halt_fn.
  4. Change patch titles as suggested by Thomas.

Kyung Min Park (2):
  x86/delay: Refactor delay_mwaitx() for TPAUSE support
  x86/delay: Introduce TPAUSE delay

Thomas Gleixner (1):
  x86/delay: Preparatory code cleanup

 arch/x86/include/asm/delay.h |   3 +-
 arch/x86/include/asm/mwait.h |  18 ++++++-
 arch/x86/kernel/time.c       |   3 ++
 arch/x86/lib/delay.c         | 114 +++++++++++++++++++++++++++++--------------
 4 files changed, 100 insertions(+), 38 deletions(-)

-- 
2.7.4


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v3 1/3] x86/delay: Preparatory code cleanup
  2020-04-10 23:29 [PATCH v3 0/3] x86/delay: Introduce TPAUSE instruction Kyung Min Park
@ 2020-04-10 23:29 ` Kyung Min Park
  2020-04-13  3:26   ` kbuild test robot
  2020-04-10 23:29 ` [PATCH v3 2/3] x86/delay: Refactor delay_mwaitx() for TPAUSE support Kyung Min Park
  2020-04-10 23:29 ` [PATCH v3 3/3] x86/delay: Introduce TPAUSE delay Kyung Min Park
  2 siblings, 1 reply; 10+ messages in thread
From: Kyung Min Park @ 2020-04-10 23:29 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: tglx, mingo, hpa, gregkh, ak, tony.luck, ashok.raj,
	ravi.v.shankar, fenghua.yu, kyung.min.park

From: Thomas Gleixner <tglx@linutronix.de>

From: Thomas Gleixner <tglx@linutronix.de>

The naming conventions in the delay code are confusing at best.

All delay variants use a loops argument and or variable which originates
from the original delay_loop() implementation. But all variants except
delay_loop() are based on TSC cycles.

Rename the argument to cycles and make it type u64 to avoid these weird
expansions to u64 in the functions.

Rename MWAITX_MAX_LOOPS to MWAITX_MAX_WAIT_CYCLES for the same reason
and fixup the comment of delay_mwaitx() as well.

Mark the delay_fn function pointer __ro_after_init and fixup the comment
for it.

No functional change and preparation for the upcoming TPAUSE based delay
variant.

[Kyung Min Park: Added __init to use_tsc_delay()]

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Kyung Min Park <kyung.min.park@intel.com>
---
 arch/x86/include/asm/delay.h |  2 +-
 arch/x86/include/asm/mwait.h |  2 +-
 arch/x86/lib/delay.c         | 45 +++++++++++++++++++++++---------------------
 3 files changed, 26 insertions(+), 23 deletions(-)

diff --git a/arch/x86/include/asm/delay.h b/arch/x86/include/asm/delay.h
index de9e784..bb91d7c 100644
--- a/arch/x86/include/asm/delay.h
+++ b/arch/x86/include/asm/delay.h
@@ -4,7 +4,7 @@
 
 #include <asm-generic/delay.h>
 
-void use_tsc_delay(void);
+void __init use_tsc_delay(void);
 void use_mwaitx_delay(void);
 
 #endif /* _ASM_X86_DELAY_H */
diff --git a/arch/x86/include/asm/mwait.h b/arch/x86/include/asm/mwait.h
index b809f11..a43b35b 100644
--- a/arch/x86/include/asm/mwait.h
+++ b/arch/x86/include/asm/mwait.h
@@ -20,7 +20,7 @@
 
 #define MWAIT_ECX_INTERRUPT_BREAK	0x1
 #define MWAITX_ECX_TIMER_ENABLE		BIT(1)
-#define MWAITX_MAX_LOOPS		((u32)-1)
+#define MWAITX_MAX_WAIT_CYCLES		UINT_MAX
 #define MWAITX_DISABLE_CSTATES		0xf0
 
 u32 get_umwait_control_msr(void);
diff --git a/arch/x86/lib/delay.c b/arch/x86/lib/delay.c
index c126571..887d52d 100644
--- a/arch/x86/lib/delay.c
+++ b/arch/x86/lib/delay.c
@@ -27,9 +27,19 @@
 # include <asm/smp.h>
 #endif
 
+static void delay_loop(u64 __loops);
+
+/*
+ * Calibration and selection of the delay mechanism happens only once
+ * during boot.
+ */
+static void (*delay_fn)(u64) __ro_after_init = delay_loop;
+
 /* simple loop based delay: */
-static void delay_loop(unsigned long loops)
+static void delay_loop(u64 __loops)
 {
+	unsigned long loops = (unsigned long)__loops;
+
 	asm volatile(
 		"	test %0,%0	\n"
 		"	jz 3f		\n"
@@ -49,9 +59,9 @@ static void delay_loop(unsigned long loops)
 }
 
 /* TSC based delay: */
-static void delay_tsc(unsigned long __loops)
+static void delay_tsc(u64 cycles)
 {
-	u64 bclock, now, loops = __loops;
+	u64 bclock, now;
 	int cpu;
 
 	preempt_disable();
@@ -59,7 +69,7 @@ static void delay_tsc(unsigned long __loops)
 	bclock = rdtsc_ordered();
 	for (;;) {
 		now = rdtsc_ordered();
-		if ((now - bclock) >= loops)
+		if ((now - bclock) >= cycles)
 			break;
 
 		/* Allow RT tasks to run */
@@ -77,7 +87,7 @@ static void delay_tsc(unsigned long __loops)
 		 * counter for this CPU.
 		 */
 		if (unlikely(cpu != smp_processor_id())) {
-			loops -= (now - bclock);
+			cycles -= (now - bclock);
 			cpu = smp_processor_id();
 			bclock = rdtsc_ordered();
 		}
@@ -87,24 +97,24 @@ static void delay_tsc(unsigned long __loops)
 
 /*
  * On some AMD platforms, MWAITX has a configurable 32-bit timer, that
- * counts with TSC frequency. The input value is the loop of the
- * counter, it will exit when the timer expires.
+ * counts with TSC frequency. The input value is the number of TSC cycles
+ * to wait. MWAITX will also exit when the timer expires.
  */
-static void delay_mwaitx(unsigned long __loops)
+static void delay_mwaitx(u64 cycles)
 {
-	u64 start, end, delay, loops = __loops;
+	u64 start, end, delay;
 
 	/*
 	 * Timer value of 0 causes MWAITX to wait indefinitely, unless there
 	 * is a store on the memory monitored by MONITORX.
 	 */
-	if (loops == 0)
+	if (!cycles)
 		return;
 
 	start = rdtsc_ordered();
 
 	for (;;) {
-		delay = min_t(u64, MWAITX_MAX_LOOPS, loops);
+		delay = min_t(u64, MWAITX_MAX_WAIT_CYCLES, cycles);
 
 		/*
 		 * Use cpu_tss_rw as a cacheline-aligned, seldomly
@@ -121,22 +131,15 @@ static void delay_mwaitx(unsigned long __loops)
 
 		end = rdtsc_ordered();
 
-		if (loops <= end - start)
+		if (cycles <= end - start)
 			break;
 
-		loops -= end - start;
-
+		cycles -= end - start;
 		start = end;
 	}
 }
 
-/*
- * Since we calibrate only once at boot, this
- * function should be set once at boot and not changed
- */
-static void (*delay_fn)(unsigned long) = delay_loop;
-
-void use_tsc_delay(void)
+void __init use_tsc_delay(void)
 {
 	if (delay_fn == delay_loop)
 		delay_fn = delay_tsc;
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v3 2/3] x86/delay: Refactor delay_mwaitx() for TPAUSE support
  2020-04-10 23:29 [PATCH v3 0/3] x86/delay: Introduce TPAUSE instruction Kyung Min Park
  2020-04-10 23:29 ` [PATCH v3 1/3] x86/delay: Preparatory code cleanup Kyung Min Park
@ 2020-04-10 23:29 ` Kyung Min Park
  2020-04-10 23:29 ` [PATCH v3 3/3] x86/delay: Introduce TPAUSE delay Kyung Min Park
  2 siblings, 0 replies; 10+ messages in thread
From: Kyung Min Park @ 2020-04-10 23:29 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: tglx, mingo, hpa, gregkh, ak, tony.luck, ashok.raj,
	ravi.v.shankar, fenghua.yu, kyung.min.park

Refactor code to make it easier to add a new model specific function to
delay for a number of cycles.

No functional change.

Reviewed-by: Tony Luck <tony.luck@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Kyung Min Park <kyung.min.park@intel.com>
---
 arch/x86/lib/delay.c | 48 ++++++++++++++++++++++++++++++------------------
 1 file changed, 30 insertions(+), 18 deletions(-)

diff --git a/arch/x86/lib/delay.c b/arch/x86/lib/delay.c
index 887d52d..fe91dc1 100644
--- a/arch/x86/lib/delay.c
+++ b/arch/x86/lib/delay.c
@@ -34,6 +34,7 @@ static void delay_loop(u64 __loops);
  * during boot.
  */
 static void (*delay_fn)(u64) __ro_after_init = delay_loop;
+static void (*delay_halt_fn)(u64 start, u64 cycles) __ro_after_init;
 
 /* simple loop based delay: */
 static void delay_loop(u64 __loops)
@@ -100,9 +101,33 @@ static void delay_tsc(u64 cycles)
  * counts with TSC frequency. The input value is the number of TSC cycles
  * to wait. MWAITX will also exit when the timer expires.
  */
-static void delay_mwaitx(u64 cycles)
+static void delay_halt_mwaitx(u64 unused, u64 cycles)
 {
-	u64 start, end, delay;
+	u64 delay;
+
+	delay = min_t(u64, MWAITX_MAX_WAIT_CYCLES, cycles);
+	/*
+	 * Use cpu_tss_rw as a cacheline-aligned, seldomly accessed per-cpu
+	 * variable as the monitor target.
+	 */
+	 __monitorx(raw_cpu_ptr(&cpu_tss_rw), 0, 0);
+
+	/*
+	 * AMD, like Intel, supports the EAX hint and EAX=0xf means, do not
+	 * enter any deep C-state and we use it here in delay() to minimize
+	 * wakeup latency.
+	 */
+	__mwaitx(MWAITX_DISABLE_CSTATES, delay, MWAITX_ECX_TIMER_ENABLE);
+}
+
+/*
+ * Call a vendor specific function to delay for a given amount of time. Because
+ * these functions may return earlier than requested, check for actual elapsed
+ * time and call again until done.
+ */
+static void delay_halt(u64 __cycles)
+{
+	u64 start, end, cycles = __cycles;
 
 	/*
 	 * Timer value of 0 causes MWAITX to wait indefinitely, unless there
@@ -114,21 +139,7 @@ static void delay_mwaitx(u64 cycles)
 	start = rdtsc_ordered();
 
 	for (;;) {
-		delay = min_t(u64, MWAITX_MAX_WAIT_CYCLES, cycles);
-
-		/*
-		 * Use cpu_tss_rw as a cacheline-aligned, seldomly
-		 * accessed per-cpu variable as the monitor target.
-		 */
-		__monitorx(raw_cpu_ptr(&cpu_tss_rw), 0, 0);
-
-		/*
-		 * AMD, like Intel's MWAIT version, supports the EAX hint and
-		 * EAX=0xf0 means, do not enter any deep C-state and we use it
-		 * here in delay() to minimize wakeup latency.
-		 */
-		__mwaitx(MWAITX_DISABLE_CSTATES, delay, MWAITX_ECX_TIMER_ENABLE);
-
+		delay_halt_fn(start, cycles);
 		end = rdtsc_ordered();
 
 		if (cycles <= end - start)
@@ -147,7 +158,8 @@ void __init use_tsc_delay(void)
 
 void use_mwaitx_delay(void)
 {
-	delay_fn = delay_mwaitx;
+	delay_halt_fn = delay_halt_mwaitx;
+	delay_fn = delay_halt;
 }
 
 int read_current_timer(unsigned long *timer_val)
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v3 3/3] x86/delay: Introduce TPAUSE delay
  2020-04-10 23:29 [PATCH v3 0/3] x86/delay: Introduce TPAUSE instruction Kyung Min Park
  2020-04-10 23:29 ` [PATCH v3 1/3] x86/delay: Preparatory code cleanup Kyung Min Park
  2020-04-10 23:29 ` [PATCH v3 2/3] x86/delay: Refactor delay_mwaitx() for TPAUSE support Kyung Min Park
@ 2020-04-10 23:29 ` Kyung Min Park
  2020-04-13  5:42   ` kbuild test robot
  2020-04-14 10:31   ` Peter Zijlstra
  2 siblings, 2 replies; 10+ messages in thread
From: Kyung Min Park @ 2020-04-10 23:29 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: tglx, mingo, hpa, gregkh, ak, tony.luck, ashok.raj,
	ravi.v.shankar, fenghua.yu, kyung.min.park

TPAUSE instructs the processor to enter an implementation-dependent
optimized state. The instruction execution wakes up when the time-stamp
counter reaches or exceeds the implicit EDX:EAX 64-bit input value.
The instruction execution also wakes up due to the expiration of
the operating system time-limit or by an external interrupt
or exceptions such as a debug exception or a machine check exception.

TPAUSE offers a choice of two lower power states:
 1. Light-weight power/performance optimized state C0.1
 2. Improved power/performance optimized state C0.2
This way, it can save power with low wake-up latency in comparison to
spinloop based delay. The selection between the two is governed by the
input register.

TPAUSE is available on processors with X86_FEATURE_WAITPKG.

Reviewed-by: Tony Luck <tony.luck@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Kyung Min Park <kyung.min.park@intel.com>
---
 arch/x86/include/asm/delay.h |  1 +
 arch/x86/include/asm/mwait.h | 16 ++++++++++++++++
 arch/x86/kernel/time.c       |  3 +++
 arch/x86/lib/delay.c         | 27 +++++++++++++++++++++++++++
 4 files changed, 47 insertions(+)

diff --git a/arch/x86/include/asm/delay.h b/arch/x86/include/asm/delay.h
index bb91d7c..b57048f 100644
--- a/arch/x86/include/asm/delay.h
+++ b/arch/x86/include/asm/delay.h
@@ -5,6 +5,7 @@
 #include <asm-generic/delay.h>
 
 void __init use_tsc_delay(void);
+void __init use_tpause_delay(void);
 void use_mwaitx_delay(void);
 
 #endif /* _ASM_X86_DELAY_H */
diff --git a/arch/x86/include/asm/mwait.h b/arch/x86/include/asm/mwait.h
index a43b35b..b00596b 100644
--- a/arch/x86/include/asm/mwait.h
+++ b/arch/x86/include/asm/mwait.h
@@ -22,6 +22,8 @@
 #define MWAITX_ECX_TIMER_ENABLE		BIT(1)
 #define MWAITX_MAX_WAIT_CYCLES		UINT_MAX
 #define MWAITX_DISABLE_CSTATES		0xf0
+#define TPAUSE_C01_STATE		1
+#define TPAUSE_C02_STATE		0
 
 u32 get_umwait_control_msr(void);
 
@@ -122,4 +124,18 @@ static inline void mwait_idle_with_hints(unsigned long eax, unsigned long ecx)
 	current_clr_polling();
 }
 
+/*
+ * Caller can specify whether to enter C0.1 (low latency, less
+ * power saving) or C0.2 state (saves more power, but longer wakeup
+ * latency). This may be overridden by the IA32_UMWAIT_CONTROL MSR
+ * which can force requests for C0.2 to be downgraded to C0.1.
+ */
+static inline void __tpause(u32 ecx, u32 edx, u32 eax)
+{
+	/* "tpause %ecx, %edx, %eax;" */
+	asm volatile(".byte 0x66, 0x0f, 0xae, 0xf1\t\n"
+		     :
+		     : "c"(ecx), "d"(edx), "a"(eax));
+}
+
 #endif /* _ASM_X86_MWAIT_H */
diff --git a/arch/x86/kernel/time.c b/arch/x86/kernel/time.c
index 106e7f8..371a6b3 100644
--- a/arch/x86/kernel/time.c
+++ b/arch/x86/kernel/time.c
@@ -103,6 +103,9 @@ static __init void x86_late_time_init(void)
 	 */
 	x86_init.irqs.intr_mode_init();
 	tsc_init();
+
+	if (static_cpu_has(X86_FEATURE_WAITPKG))
+		use_tpause_delay();
 }
 
 /*
diff --git a/arch/x86/lib/delay.c b/arch/x86/lib/delay.c
index fe91dc1..65d15df 100644
--- a/arch/x86/lib/delay.c
+++ b/arch/x86/lib/delay.c
@@ -97,6 +97,27 @@ static void delay_tsc(u64 cycles)
 }
 
 /*
+ * On Intel the TPAUSE instruction waits until any of:
+ * 1) the TSC counter exceeds the value provided in EDX:EAX
+ * 2) global timeout in IA32_UMWAIT_CONTROL is exceeded
+ * 3) an external interrupt occurs
+ */
+static void delay_halt_tpause(u64 start, u64 cycles)
+{
+	u64 until = start + cycles;
+	u32 eax, edx;
+
+	eax = lower_32_bits(until);
+	edx = upper_32_bits(until);
+
+	/*
+	 * Hard code the deeper (C0.2) sleep state because exit latency is
+	 * small compared to the "microseconds" that usleep() will delay.
+	 */
+	__tpause(TPAUSE_C02_STATE, edx, eax);
+}
+
+/*
  * On some AMD platforms, MWAITX has a configurable 32-bit timer, that
  * counts with TSC frequency. The input value is the number of TSC cycles
  * to wait. MWAITX will also exit when the timer expires.
@@ -156,6 +177,12 @@ void __init use_tsc_delay(void)
 		delay_fn = delay_tsc;
 }
 
+void __init use_tpause_delay(void)
+{
+	delay_halt_fn = delay_halt_tpause;
+	delay_fn = delay_halt;
+}
+
 void use_mwaitx_delay(void)
 {
 	delay_halt_fn = delay_halt_mwaitx;
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH v3 1/3] x86/delay: Preparatory code cleanup
  2020-04-10 23:29 ` [PATCH v3 1/3] x86/delay: Preparatory code cleanup Kyung Min Park
@ 2020-04-13  3:26   ` kbuild test robot
  0 siblings, 0 replies; 10+ messages in thread
From: kbuild test robot @ 2020-04-13  3:26 UTC (permalink / raw)
  To: Kyung Min Park; +Cc: kbuild-all, x86, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1484 bytes --]

Hi Kyung,

I love your patch! Yet something to improve:

[auto build test ERROR on tip/auto-latest]
[also build test ERROR on linus/master tip/x86/core linux/master v5.7-rc1 next-20200412]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:    https://github.com/0day-ci/linux/commits/Kyung-Min-Park/x86-delay-Introduce-TPAUSE-instruction/20200411-073203
base:   https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git 1d625b673f4477a92277a2291be07fa82519ce15
config: i386-debian-10.3 (attached as .config)
compiler: gcc-7 (Ubuntu 7.5.0-6ubuntu2) 7.5.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kbuild test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   In file included from arch/x86/kernel/reboot_fixups_32.c:11:0:
>> arch/x86/include/asm/delay.h:7:13: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'use_tsc_delay'
    void __init use_tsc_delay(void);
                ^~~~~~~~~~~~~

vim +7 arch/x86/include/asm/delay.h

     6	
   > 7	void __init use_tsc_delay(void);
     8	void use_mwaitx_delay(void);
     9	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 35142 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v3 3/3] x86/delay: Introduce TPAUSE delay
  2020-04-10 23:29 ` [PATCH v3 3/3] x86/delay: Introduce TPAUSE delay Kyung Min Park
@ 2020-04-13  5:42   ` kbuild test robot
  2020-04-14 10:31   ` Peter Zijlstra
  1 sibling, 0 replies; 10+ messages in thread
From: kbuild test robot @ 2020-04-13  5:42 UTC (permalink / raw)
  To: Kyung Min Park; +Cc: kbuild-all, x86, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1726 bytes --]

Hi Kyung,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on tip/auto-latest]
[also build test ERROR on linus/master tip/x86/core linux/master v5.7-rc1 next-20200412]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:    https://github.com/0day-ci/linux/commits/Kyung-Min-Park/x86-delay-Introduce-TPAUSE-instruction/20200411-073203
base:   https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git 1d625b673f4477a92277a2291be07fa82519ce15
config: i386-debian-10.3 (attached as .config)
compiler: gcc-7 (Ubuntu 7.5.0-6ubuntu2) 7.5.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kbuild test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   In file included from arch/x86/kernel/reboot_fixups_32.c:11:0:
   arch/x86/include/asm/delay.h:7:13: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'use_tsc_delay'
    void __init use_tsc_delay(void);
                ^~~~~~~~~~~~~
>> arch/x86/include/asm/delay.h:8:13: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'use_tpause_delay'
    void __init use_tpause_delay(void);
                ^~~~~~~~~~~~~~~~

vim +8 arch/x86/include/asm/delay.h

     6	
   > 7	void __init use_tsc_delay(void);
   > 8	void __init use_tpause_delay(void);
     9	void use_mwaitx_delay(void);
    10	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 35142 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v3 3/3] x86/delay: Introduce TPAUSE delay
  2020-04-10 23:29 ` [PATCH v3 3/3] x86/delay: Introduce TPAUSE delay Kyung Min Park
  2020-04-13  5:42   ` kbuild test robot
@ 2020-04-14 10:31   ` Peter Zijlstra
  2020-04-14 18:00     ` Luck, Tony
  1 sibling, 1 reply; 10+ messages in thread
From: Peter Zijlstra @ 2020-04-14 10:31 UTC (permalink / raw)
  To: Kyung Min Park
  Cc: x86, linux-kernel, tglx, mingo, hpa, gregkh, ak, tony.luck,
	ashok.raj, ravi.v.shankar, fenghua.yu

On Fri, Apr 10, 2020 at 04:29:55PM -0700, Kyung Min Park wrote:
> +static inline void __tpause(u32 ecx, u32 edx, u32 eax)
> +{
> +	/* "tpause %ecx, %edx, %eax;" */
> +	asm volatile(".byte 0x66, 0x0f, 0xae, 0xf1\t\n"
> +		     :
> +		     : "c"(ecx), "d"(edx), "a"(eax));
> +}

Can we please get a comment stating from what binutils version this
opcode has a mnemonic? That way, when we raise the minimum binutils
version we can easily grep and find such things.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: [PATCH v3 3/3] x86/delay: Introduce TPAUSE delay
  2020-04-14 10:31   ` Peter Zijlstra
@ 2020-04-14 18:00     ` Luck, Tony
  2020-04-14 18:05       ` Peter Zijlstra
  0 siblings, 1 reply; 10+ messages in thread
From: Luck, Tony @ 2020-04-14 18:00 UTC (permalink / raw)
  To: Peter Zijlstra, Park, Kyung Min
  Cc: x86, linux-kernel, tglx, mingo, hpa, gregkh, ak, Raj, Ashok,
	Shankar, Ravi V, Yu, Fenghua

>> +static inline void __tpause(u32 ecx, u32 edx, u32 eax)
>> +{
>> +	/* "tpause %ecx, %edx, %eax;" */
>> +	asm volatile(".byte 0x66, 0x0f, 0xae, 0xf1\t\n"
>> +		     :
>> +		     : "c"(ecx), "d"(edx), "a"(eax));
>> +}
>
> Can we please get a comment stating from what binutils version this
> opcode has a mnemonic? That way, when we raise the minimum binutils
> version we can easily grep and find such things.

Or maybe use arch/x86/Kconfig.assembler to set up a CONFIG_AS_TPAUSE?

Then the code can read something like (syntax may need fixing)

#ifdef CONFIG_AS_TPAUSE
		asm volatile("tpause %ecx\n", : : "c"(ecx), "d"(edx), "a"(eax));
#else
		asm volatile(".byte hex gibberish ...
#endif

-Tony

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v3 3/3] x86/delay: Introduce TPAUSE delay
  2020-04-14 18:00     ` Luck, Tony
@ 2020-04-14 18:05       ` Peter Zijlstra
  2020-04-15 19:57         ` km
  0 siblings, 1 reply; 10+ messages in thread
From: Peter Zijlstra @ 2020-04-14 18:05 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Park, Kyung Min, x86, linux-kernel, tglx, mingo, hpa, gregkh, ak,
	Raj, Ashok, Shankar, Ravi V, Yu, Fenghua

On Tue, Apr 14, 2020 at 06:00:21PM +0000, Luck, Tony wrote:
> >> +static inline void __tpause(u32 ecx, u32 edx, u32 eax)
> >> +{
> >> +	/* "tpause %ecx, %edx, %eax;" */
> >> +	asm volatile(".byte 0x66, 0x0f, 0xae, 0xf1\t\n"
> >> +		     :
> >> +		     : "c"(ecx), "d"(edx), "a"(eax));
> >> +}
> >
> > Can we please get a comment stating from what binutils version this
> > opcode has a mnemonic? That way, when we raise the minimum binutils
> > version we can easily grep and find such things.
> 
> Or maybe use arch/x86/Kconfig.assembler to set up a CONFIG_AS_TPAUSE?
> 
> Then the code can read something like (syntax may need fixing)
> 
> #ifdef CONFIG_AS_TPAUSE
> 		asm volatile("tpause %ecx\n", : : "c"(ecx), "d"(edx), "a"(eax));
> #else
> 		asm volatile(".byte hex gibberish ...
> #endif

Then we still need a comment to know when we can kill that...

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v3 3/3] x86/delay: Introduce TPAUSE delay
  2020-04-14 18:05       ` Peter Zijlstra
@ 2020-04-15 19:57         ` km
  0 siblings, 0 replies; 10+ messages in thread
From: km @ 2020-04-15 19:57 UTC (permalink / raw)
  To: Peter Zijlstra, Luck, Tony
  Cc: x86, linux-kernel, tglx, mingo, hpa, gregkh, ak, Raj, Ashok,
	Shankar, Ravi V, Yu, Fenghua

Hi Peter,

On Tue, 2020-04-14 at 20:05 +0200, Peter Zijlstra wrote:
> On Tue, Apr 14, 2020 at 06:00:21PM +0000, Luck, Tony wrote:
> > > > +static inline void __tpause(u32 ecx, u32 edx, u32 eax)
> > > > +{
> > > > +	/* "tpause %ecx, %edx, %eax;" */
> > > > +	asm volatile(".byte 0x66, 0x0f, 0xae, 0xf1\t\n"
> > > > +		     :
> > > > +		     : "c"(ecx), "d"(edx), "a"(eax));
> > > > +}
> > > 
> > > Can we please get a comment stating from what binutils version
> > > this
> > > opcode has a mnemonic? That way, when we raise the minimum
> > > binutils
> > > version we can easily grep and find such things.
> > 
> > Or maybe use arch/x86/Kconfig.assembler to set up a
> > CONFIG_AS_TPAUSE?
> > 
> > Then the code can read something like (syntax may need fixing)
> > 
> > #ifdef CONFIG_AS_TPAUSE
> > 		asm volatile("tpause %ecx\n", : : "c"(ecx), "d"(edx),
> > "a"(eax));
> > #else
> > 		asm volatile(".byte hex gibberish ...
> > #endif
> 
> Then we still need a comment to know when we can kill that...

Thanks. I'll update in next patch.


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2020-04-15 20:08 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-10 23:29 [PATCH v3 0/3] x86/delay: Introduce TPAUSE instruction Kyung Min Park
2020-04-10 23:29 ` [PATCH v3 1/3] x86/delay: Preparatory code cleanup Kyung Min Park
2020-04-13  3:26   ` kbuild test robot
2020-04-10 23:29 ` [PATCH v3 2/3] x86/delay: Refactor delay_mwaitx() for TPAUSE support Kyung Min Park
2020-04-10 23:29 ` [PATCH v3 3/3] x86/delay: Introduce TPAUSE delay Kyung Min Park
2020-04-13  5:42   ` kbuild test robot
2020-04-14 10:31   ` Peter Zijlstra
2020-04-14 18:00     ` Luck, Tony
2020-04-14 18:05       ` Peter Zijlstra
2020-04-15 19:57         ` km

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).