* [PATCH v5 0/2] x86, mwaitt: introduce AMD mwaitt support
@ 2015-06-15 10:48 Huang Rui
  2015-06-15 10:48 ` [PATCH v5 1/2] x86, mwaitt: add monitorx and mwaitx instruction Huang Rui
  2015-06-15 10:48 ` [PATCH v5 2/2] x86, mwaitt: introduce mwaix delay with a configurable timer Huang Rui
  0 siblings, 2 replies; 11+ messages in thread
From: Huang Rui @ 2015-06-15 10:48 UTC (permalink / raw)
  To: Borislav Petkov, Andy Lutomirski, Thomas Gleixner,
	Peter Zijlstra, Ingo Molnar, Rafael J. Wysocki, Len Brown,
	John Stultz, Frédéric Weisbecker
  Cc: linux-kernel, x86, Fengguang Wu, Aaron Lu, Suravee Suthikulanit,
	Tony Li, Ken Xue, Huang Rui

Hi,

This patch set introduces support for new instructions on AMD Carrizo
(Family 15h, Models 60h-6fh). It adds an mwaitx delay function with a
configurable timer.

Andy and Boris suggested using mwaitx for the delay method.

For background discussion, please see:
http://marc.info/?l=linux-kernel&m=143202042530498&w=2
http://marc.info/?l=linux-kernel&m=143161327003541&w=2
http://marc.info/?l=linux-kernel&m=143222815331016&w=2

The patches are rebased on tip/master.

Changes from v1 -> v2
- Remove the mwaitx idle implementation, since it was disputed and showed
  no power improvement.
- Add a patch that implements another use case: delay.
- Introduce a kernel parameter (delay) to make the delay method configurable.

Changes from v2 -> v3
- Add comparison data to the commit message
- Remove the kernel parameter
- Add a hint to avoid entering deep C-states in the future
- Update the mwaitx delay method per Peter's suggestion

Changes from v3 -> v4
- Put the MONITORX/MWAITX description into comments

Changes from v4 -> v5
- Remove mwaitx function
- Enable the mwaitx delay in init_amd
- Use cpu_tss as the monitored address

I have already done some testing of mwaitx with udelay.

Test scenario:

glb_loops = usec_to_tsc(delay_usec)
rdtsc -> read TSC counters as start
mwaitx_delay(glb_loops)
rdtsc -> read TSC counters as end
diff = end - start

Here diff is the number of TSC cycles that actually elapsed, and glb_loops
is the value passed in EBX, i.e. the number of cycles the user requested.
Below is the result for a 10000 us mwaitx delay; the overshoot
(diff - glb_loops) is about 1200 cycles.

[ 2369.008651] start=4401974718758, end=4401992576888, diff=17858130, glb_loops=17856939
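
As a worked check of the numbers above, the following C sketch recomputes
the overshoot from the logged values. It assumes that usec_to_tsc() is
linear, so that glb_loops divided by the 10000 us delay gives the TSC
frequency. The helper names are hypothetical and not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Cycles by which the measured delay exceeded the requested one. */
static uint64_t overshoot_cycles(uint64_t start, uint64_t end,
				 uint64_t requested)
{
	uint64_t diff;

	assert(end >= start);
	diff = end - start;
	assert(diff >= requested);
	return diff - requested;
}

/*
 * TSC frequency in kHz implied by "requested cycles == delay_usec".
 * Assumption: the usec-to-cycles conversion is linear.
 */
static uint64_t implied_tsc_khz(uint64_t requested, uint64_t delay_usec)
{
	assert(delay_usec > 0);
	return requested * 1000 / delay_usec;
}

/* Convert a cycle count to nanoseconds at the given TSC frequency. */
static uint64_t cycles_to_ns(uint64_t cycles, uint64_t tsc_khz)
{
	assert(tsc_khz > 0);
	return cycles * 1000000 / tsc_khz;
}
```

With the logged values, overshoot_cycles(4401974718758, 4401992576888,
17856939) gives 1191 cycles. At the implied TSC frequency of about
1785693 kHz that is roughly 666 ns, i.e. under 0.01% of the requested
delay.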

Thanks,
Rui

Huang Rui (2):
  x86, mwaitt: add monitorx and mwaitx instruction
  x86, mwaitt: introduce mwaix delay with a configurable timer

 arch/x86/include/asm/cpufeature.h |  1 +
 arch/x86/include/asm/delay.h      |  1 +
 arch/x86/include/asm/mwait.h      | 43 +++++++++++++++++++++++++++++++++++++
 arch/x86/kernel/cpu/amd.c         |  4 ++++
 arch/x86/lib/delay.c              | 45 +++++++++++++++++++++++++++++++++++++++
 5 files changed, 94 insertions(+)

-- 
1.9.1


* [PATCH v5 1/2] x86, mwaitt: add monitorx and mwaitx instruction
  2015-06-15 10:48 [PATCH v5 0/2] x86, mwaitt: introduce AMD mwaitt support Huang Rui
@ 2015-06-15 10:48 ` Huang Rui
  2015-06-15 10:55   ` Peter Zijlstra
  2015-06-15 10:48 ` [PATCH v5 2/2] x86, mwaitt: introduce mwaix delay with a configurable timer Huang Rui
  1 sibling, 1 reply; 11+ messages in thread
From: Huang Rui @ 2015-06-15 10:48 UTC (permalink / raw)
  To: Borislav Petkov, Andy Lutomirski, Thomas Gleixner,
	Peter Zijlstra, Ingo Molnar, Rafael J. Wysocki, Len Brown,
	John Stultz, Frédéric Weisbecker
  Cc: linux-kernel, x86, Fengguang Wu, Aaron Lu, Suravee Suthikulanit,
	Tony Li, Ken Xue, Huang Rui

On AMD Carrizo processors (Family 15h, Model 60h-6fh), there is a new
feature called MWAITT (Mwait with a timer) as an extension of
Monitor/Mwait.

MWAITT, also known as MWAITX (MWAIT with extensions), has a configurable
timer that causes MWAITX to exit when it expires.
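
The feature flag added by this patch is encoded as (6*32+29), i.e. word 6,
bit 29 of the capability array. The sketch below only illustrates that
encoding with a plain array; has_feature() is a hypothetical helper, not
the kernel's cpu_has()/static_cpu_has() machinery.

```c
#include <assert.h>
#include <stdint.h>

/* Same encoding as the flag added in cpufeature.h: word 6, bit 29. */
#define X86_FEATURE_MWAITT (6 * 32 + 29)

/*
 * Hypothetical helper: test a feature number against an array of
 * 32-bit capability words. feature / 32 selects the word,
 * feature % 32 selects the bit inside that word.
 */
static int has_feature(const uint32_t *caps, unsigned int nwords,
		       unsigned int feature)
{
	unsigned int word = feature / 32;
	unsigned int bit = feature % 32;

	assert(word < nwords);
	return (caps[word] >> bit) & 1u;
}
```

For example, with an 8-word array in which only caps[6] has bit 29 set,
has_feature(caps, 8, X86_FEATURE_MWAITT) returns 1.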

Compared with MONITOR/MWAIT, there are minor differences in opcode and
input parameters.

MWAITX ECX[1]: enable timer if set
MWAITX EBX[31:0]: max wait time expressed in SW P0 clocks

                MWAIT                           MWAITX
opcode          0f 01 c9           |            0f 01 fb
ECX[0]                  value of RFLAGS.IF seen by instruction
ECX[1]          unused/#GP if set  |            enable timer if set
ECX[31:2]                     unused/#GP if set
EAX                           unused (reserve for hint)
EBX[31:0]       unused             |            max wait time (loops)

                MONITOR                         MONITORX
opcode          0f 01 c8           |            0f 01 fa
EAX                     (logical) address to monitor
ECX                     #GP if not zero

The software P0 frequency is the same as the TSC frequency.

Max timeout = EBX/(TSC frequency)
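
To make the formula concrete, here is a small C sketch that converts an
EBX value into a timeout. The two macros mirror the definitions in patch
2/2; mwaitx_timeout_ns() and the 2 GHz TSC frequency used in the note
below are assumptions chosen purely for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define MWAITX_ECX_TIMER_ENABLE	(1u << 1)	/* ECX[1]: enable timer */
#define MWAITX_MAX_LOOPS	((uint32_t)-1)	/* EBX is 32 bits wide */

/*
 * Timeout in nanoseconds for a given EBX value:
 *   timeout = EBX / (TSC frequency)
 * tsc_khz is the TSC frequency in kHz (hypothetical input).
 */
static uint64_t mwaitx_timeout_ns(uint32_t ebx, uint64_t tsc_khz)
{
	assert(tsc_khz > 0);
	return (uint64_t)ebx * 1000000 / tsc_khz;
}
```

At a hypothetical 2 GHz TSC (tsc_khz = 2000000), the largest EBX value,
MWAITX_MAX_LOOPS, corresponds to a timeout of about 2.1 seconds, so a
longer delay would have to be split across several MWAITX calls.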

Signed-off-by: Huang Rui <ray.huang@amd.com>
---
 arch/x86/include/asm/cpufeature.h |  1 +
 arch/x86/include/asm/mwait.h      | 40 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 41 insertions(+)

diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index 3d6606f..3ef1f6e 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -176,6 +176,7 @@
 #define X86_FEATURE_PERFCTR_NB  ( 6*32+24) /* NB performance counter extensions */
 #define X86_FEATURE_BPEXT	(6*32+26) /* data breakpoint extension */
 #define X86_FEATURE_PERFCTR_L2	( 6*32+28) /* L2 performance counter extensions */
+#define X86_FEATURE_MWAITT	( 6*32+29) /* Mwait extension (MonitorX/MwaitX) */
 
 /*
  * Auxiliary flags: Linux defined - For features scattered in various
diff --git a/arch/x86/include/asm/mwait.h b/arch/x86/include/asm/mwait.h
index 653dfa7..1fbc89d 100644
--- a/arch/x86/include/asm/mwait.h
+++ b/arch/x86/include/asm/mwait.h
@@ -23,6 +23,14 @@ static inline void __monitor(const void *eax, unsigned long ecx,
 		     :: "a" (eax), "c" (ecx), "d"(edx));
 }
 
+static inline void __monitorx(const void *eax, unsigned long ecx,
+			      unsigned long edx)
+{
+	/* "monitorx %eax, %ecx, %edx;" */
+	asm volatile(".byte 0x0f, 0x01, 0xfa;"
+		     :: "a" (eax), "c" (ecx), "d"(edx));
+}
+
 static inline void __mwait(unsigned long eax, unsigned long ecx)
 {
 	/* "mwait %eax, %ecx;" */
@@ -30,6 +38,38 @@ static inline void __mwait(unsigned long eax, unsigned long ecx)
 		     :: "a" (eax), "c" (ecx));
 }
 
+/*
+ * MWAITT allows for both a timer value to get you out of the MWAIT as
+ * well as the normal exit conditions.
+ *
+ * MWAITX ECX[1]: enable timer if set
+ * MWAITX EBX[31:0]: max wait time expressed in SW P0 clocks
+ *
+ * Below is the compared data between MWAIT and MWAITX on AMD
+ * processors:
+ *                 MWAIT                           MWAITX
+ * opcode          0f 01 c9           |            0f 01 fb
+ * ECX[0]                  value of RFLAGS.IF seen by instruction
+ * ECX[1]          unused/#GP if set  |            enable timer if set
+ * ECX[31:2]                     unused/#GP if set
+ * EAX                           unused (reserve for hint)
+ * EBX[31:0]       unused             |            max wait time (loops)
+ *
+ *                 MONITOR                         MONITORX
+ * opcode          0f 01 c8           |            0f 01 fa
+ * EAX                     (logical) address to monitor
+ * ECX                     #GP if not zero
+ *
+ * The software P0 frequency is the same as the TSC frequency.
+ */
+static inline void __mwaitx(unsigned long eax, unsigned long ebx,
+			    unsigned long ecx)
+{
+	/* "mwaitx %eax, %ebx, %ecx;" */
+	asm volatile(".byte 0x0f, 0x01, 0xfb;"
+		     :: "a" (eax), "b" (ebx), "c" (ecx));
+}
+
 static inline void __sti_mwait(unsigned long eax, unsigned long ecx)
 {
 	trace_hardirqs_on();
-- 
1.9.1


* [PATCH v5 2/2] x86, mwaitt: introduce mwaix delay with a configurable timer
  2015-06-15 10:48 [PATCH v5 0/2] x86, mwaitt: introduce AMD mwaitt support Huang Rui
  2015-06-15 10:48 ` [PATCH v5 1/2] x86, mwaitt: add monitorx and mwaitx instruction Huang Rui
@ 2015-06-15 10:48 ` Huang Rui
  2015-06-15 10:57   ` Peter Zijlstra
  1 sibling, 1 reply; 11+ messages in thread
From: Huang Rui @ 2015-06-15 10:48 UTC (permalink / raw)
  To: Borislav Petkov, Andy Lutomirski, Thomas Gleixner,
	Peter Zijlstra, Ingo Molnar, Rafael J. Wysocki, Len Brown,
	John Stultz, Frédéric Weisbecker
  Cc: linux-kernel, x86, Fengguang Wu, Aaron Lu, Suravee Suthikulanit,
	Tony Li, Ken Xue, Huang Rui

MWAITX can enable a timer whose timeout value is specified in SW P0
clocks. The SW P0 frequency is the same as the TSC frequency. The timer
provides an upper bound on how long the instruction waits before exiting.

The kernel's delay implementation can leverage the MWAITX timer. This
patch provides a new delay method (delay_mwaitx) that uses it.
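
Because the timer is only an upper bound, the wait can end early, so the
delay has to re-read the counter and wait again for whatever remains. The
userspace sketch below models that loop with a fake 32-bit counter. The
names fake_tsc, fake_timed_wait() and delay_cycles() are hypothetical
stand-ins, and no real MONITORX/MWAITX instruction is executed.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t fake_tsc;	/* stand-in for the low 32 bits of the TSC */

/*
 * Stand-in for a timed wait: it always "wakes early", after roughly
 * half of the requested interval (and after at least one cycle).
 */
static void fake_timed_wait(uint32_t max_cycles)
{
	fake_tsc += max_cycles / 2 + 1;
}

/* Wait until at least 'loops' cycles have elapsed; return wait count. */
static unsigned int delay_cycles(uint32_t loops)
{
	const uint32_t first = fake_tsc, requested = loops;
	uint32_t start = first, end;
	unsigned int waits = 0;

	for (;;) {
		fake_timed_wait(loops);
		waits++;
		end = fake_tsc;

		/* Unsigned subtraction stays correct across a 32-bit wrap. */
		if (loops <= end - start)
			break;

		loops -= end - start;	/* only wait for what is left */
		start = end;
	}

	assert((uint32_t)(fake_tsc - first) >= requested);
	return waits;
}
```

Starting with fake_tsc close to the 32-bit wrap point, delay_cycles(100)
still terminates after a handful of waits, because each elapsed interval
is computed as an unsigned difference rather than by comparing absolute
counter values.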

Suggested-by: Andy Lutomirski <luto@amacapital.net>
Suggested-by: Borislav Petkov <bp@suse.de>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Huang Rui <ray.huang@amd.com>
---
 arch/x86/include/asm/delay.h |  1 +
 arch/x86/include/asm/mwait.h |  3 +++
 arch/x86/kernel/cpu/amd.c    |  4 ++++
 arch/x86/lib/delay.c         | 45 ++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 53 insertions(+)

diff --git a/arch/x86/include/asm/delay.h b/arch/x86/include/asm/delay.h
index 9b3b4f2..36a760b 100644
--- a/arch/x86/include/asm/delay.h
+++ b/arch/x86/include/asm/delay.h
@@ -4,5 +4,6 @@
 #include <asm-generic/delay.h>
 
 void use_tsc_delay(void);
+void use_mwaitx_delay(void);
 
 #endif /* _ASM_X86_DELAY_H */
diff --git a/arch/x86/include/asm/mwait.h b/arch/x86/include/asm/mwait.h
index 1fbc89d..47f3540 100644
--- a/arch/x86/include/asm/mwait.h
+++ b/arch/x86/include/asm/mwait.h
@@ -14,6 +14,9 @@
 #define CPUID5_ECX_INTERRUPT_BREAK	0x2
 
 #define MWAIT_ECX_INTERRUPT_BREAK	0x1
+#define MWAITX_ECX_TIMER_ENABLE		BIT(1)
+#define MWAITX_MAX_LOOPS		((u32)-1)
+#define MWAITX_DISABLE_CSTATES		0xf
 
 static inline void __monitor(const void *eax, unsigned long ecx,
 			     unsigned long edx)
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 5bd3a99..1f0a8e2 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -11,6 +11,7 @@
 #include <asm/cpu.h>
 #include <asm/smp.h>
 #include <asm/pci-direct.h>
+#include <asm/delay.h>
 
 #ifdef CONFIG_X86_64
 # include <asm/mmconfig.h>
@@ -661,6 +662,9 @@ static void init_amd(struct cpuinfo_x86 *c)
 
 	early_init_amd(c);
 
+	if (static_cpu_has_safe(X86_FEATURE_MWAITT))
+		use_mwaitx_delay();
+
 	/*
 	 * Bit 31 in normal CPUID used for nonstandard 3DNow ID;
 	 * 3DNow is IDd by bit 31 in extended CPUID (1*32+31) anyway
diff --git a/arch/x86/lib/delay.c b/arch/x86/lib/delay.c
index 39d6a3d..035b6f6 100644
--- a/arch/x86/lib/delay.c
+++ b/arch/x86/lib/delay.c
@@ -20,6 +20,7 @@
 #include <asm/processor.h>
 #include <asm/delay.h>
 #include <asm/timer.h>
+#include <asm/mwait.h>
 
 #ifdef CONFIG_SMP
 # include <asm/smp.h>
@@ -87,6 +88,45 @@ static void delay_tsc(unsigned long __loops)
 }
 
 /*
+ * On AMD platforms mwaitx has a configurable 32-bit timer, that counts
+ * with TSC frequency. And the input value is the loop of the counter, it
+ * will exit with the timer expired.
+ */
+static void delay_mwaitx(unsigned long __loops)
+{
+	u32 end, start, delay, loops = __loops;
+
+	rdtsc_barrier();
+	rdtscl(start);
+
+	for (;;) {
+		delay = min(MWAITX_MAX_LOOPS, loops);
+
+		/*
+		 * Use cpu_tss as a cacheline-aligned, seldomly
+		 * accessed per-cpu variable as the monitor target.
+		 */
+		__monitorx(this_cpu_ptr(&cpu_tss), 0, 0);
+		/*
+		 * AMD, like Intel, supports the EAX hint and EAX=0xf
+		 * means, do not enter any deep C-state and we use it
+		 * here in delay() to minimize wakeup latency.
+		 */
+		__mwaitx(MWAITX_DISABLE_CSTATES, delay, MWAITX_ECX_TIMER_ENABLE);
+
+		rdtsc_barrier();
+		rdtscl(end);
+
+		if (loops <= end - start)
+			break;
+
+		loops -= end - start;
+
+		start = end;
+	}
+}
+
+/*
  * Since we calibrate only once at boot, this
  * function should be set once at boot and not changed
  */
@@ -97,6 +137,11 @@ void use_tsc_delay(void)
 	delay_fn = delay_tsc;
 }
 
+void use_mwaitx_delay(void)
+{
+	delay_fn = delay_mwaitx;
+}
+
 int read_current_timer(unsigned long *timer_val)
 {
 	if (delay_fn == delay_tsc) {
-- 
1.9.1


* Re: [PATCH v5 1/2] x86, mwaitt: add monitorx and mwaitx instruction
  2015-06-15 10:48 ` [PATCH v5 1/2] x86, mwaitt: add monitorx and mwaitx instruction Huang Rui
@ 2015-06-15 10:55   ` Peter Zijlstra
  2015-06-15 12:42     ` Huang Rui
  0 siblings, 1 reply; 11+ messages in thread
From: Peter Zijlstra @ 2015-06-15 10:55 UTC (permalink / raw)
  To: Huang Rui
  Cc: Borislav Petkov, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	Rafael J. Wysocki, Len Brown, John Stultz,
	Frédéric Weisbecker, linux-kernel, x86, Fengguang Wu,
	Aaron Lu, Suravee Suthikulanit, Tony Li, Ken Xue

On Mon, Jun 15, 2015 at 06:48:03PM +0800, Huang Rui wrote:
> +/*
> + * MWAITT allows for both a timer value to get you out of the MWAIT as
> + * well as the normal exit conditions.
> + *
> + * MWAITX ECX[1]: enable timer if set
> + * MWAITX EBX[31:0]: max wait time expressed in SW P0 clocks
> + *
> + * Below is the compared data between MWAIT and MWAITX on AMD
> + * processors:
> + *                 MWAIT                           MWAITX
> + * opcode          0f 01 c9           |            0f 01 fb
> + * ECX[0]                  value of RFLAGS.IF seen by instruction
> + * ECX[1]          unused/#GP if set  |            enable timer if set
> + * ECX[31:2]                     unused/#GP if set
> + * EAX                           unused (reserve for hint)

Seeing how you're stuffing a !0 value in here in the next patch, the
above comment seems slightly incorrect, no?

> + * EBX[31:0]       unused             |            max wait time (loops)
> + *
> + *                 MONITOR                         MONITORX
> + * opcode          0f 01 c8           |            0f 01 fa
> + * EAX                     (logical) address to monitor
> + * ECX                     #GP if not zero
> + *
> + * The software P0 frequency is the same as the TSC frequency.
> + */
> +static inline void __mwaitx(unsigned long eax, unsigned long ebx,
> +			    unsigned long ecx)
> +{
> +	/* "mwaitx %eax, %ebx, %ecx;" */
> +	asm volatile(".byte 0x0f, 0x01, 0xfb;"
> +		     :: "a" (eax), "b" (ebx), "c" (ecx));
> +}
> +
>  static inline void __sti_mwait(unsigned long eax, unsigned long ecx)
>  {
>  	trace_hardirqs_on();
> -- 
> 1.9.1
> 

* Re: [PATCH v5 2/2] x86, mwaitt: introduce mwaix delay with a configurable timer
  2015-06-15 10:48 ` [PATCH v5 2/2] x86, mwaitt: introduce mwaix delay with a configurable timer Huang Rui
@ 2015-06-15 10:57   ` Peter Zijlstra
  2015-06-15 11:15     ` Borislav Petkov
  2015-06-15 14:14     ` Huang Rui
  0 siblings, 2 replies; 11+ messages in thread
From: Peter Zijlstra @ 2015-06-15 10:57 UTC (permalink / raw)
  To: Huang Rui
  Cc: Borislav Petkov, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	Rafael J. Wysocki, Len Brown, John Stultz,
	Frédéric Weisbecker, linux-kernel, x86, Fengguang Wu,
	Aaron Lu, Suravee Suthikulanit, Tony Li, Ken Xue

On Mon, Jun 15, 2015 at 06:48:04PM +0800, Huang Rui wrote:
> diff --git a/arch/x86/include/asm/mwait.h b/arch/x86/include/asm/mwait.h
> index 1fbc89d..47f3540 100644
> --- a/arch/x86/include/asm/mwait.h
> +++ b/arch/x86/include/asm/mwait.h
> @@ -14,6 +14,9 @@
>  #define CPUID5_ECX_INTERRUPT_BREAK	0x2
>  
>  #define MWAIT_ECX_INTERRUPT_BREAK	0x1
> +#define MWAITX_ECX_TIMER_ENABLE		BIT(1)
> +#define MWAITX_MAX_LOOPS		((u32)-1)
> +#define MWAITX_DISABLE_CSTATES		0xf
>  
>  static inline void __monitor(const void *eax, unsigned long ecx,
>  			     unsigned long edx)

Should this hunk not be part of the previous patch?

>  /*
> + * On AMD platforms mwaitx has a configurable 32-bit timer, that counts
> + * with TSC frequency. And the input value is the loop of the counter, it
> + * will exit with the timer expired.
> + */
> +static void delay_mwaitx(unsigned long __loops)
> +{
> +	u32 end, start, delay, loops = __loops;
> +
> +	rdtsc_barrier();
> +	rdtscl(start);
> +
> +	for (;;) {
> +		delay = min(MWAITX_MAX_LOOPS, loops);
> +
> +		/*
> +		 * Use cpu_tss as a cacheline-aligned, seldomly
> +		 * accessed per-cpu variable as the monitor target.
> +		 */
> +		__monitorx(this_cpu_ptr(&cpu_tss), 0, 0);
> +		/*
> +		 * AMD, like Intel, supports the EAX hint and EAX=0xf
> +		 * means, do not enter any deep C-state and we use it
> +		 * here in delay() to minimize wakeup latency.
> +		 */
> +		__mwaitx(MWAITX_DISABLE_CSTATES, delay, MWAITX_ECX_TIMER_ENABLE);
> +
> +		rdtsc_barrier();
> +		rdtscl(end);
> +
> +		if (loops <= end - start)
> +			break;
> +
> +		loops -= end - start;
> +
> +		start = end;
> +	}
> +}

OK, so what is not explained is how this delay is 'better' than the
TSC delay loop we currently have.

Seeing how we disable C states, it's unlikely to use less energy, so what
exactly is its benefit, other than using fancy new instructions?

* Re: [PATCH v5 2/2] x86, mwaitt: introduce mwaix delay with a configurable timer
  2015-06-15 10:57   ` Peter Zijlstra
@ 2015-06-15 11:15     ` Borislav Petkov
  2015-06-15 14:04       ` Huang Rui
  2015-06-15 14:14     ` Huang Rui
  1 sibling, 1 reply; 11+ messages in thread
From: Borislav Petkov @ 2015-06-15 11:15 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Huang Rui, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	Rafael J. Wysocki, Len Brown, John Stultz,
	Frédéric Weisbecker, linux-kernel, x86, Fengguang Wu,
	Aaron Lu, Suravee Suthikulanit, Tony Li, Ken Xue

On Mon, Jun 15, 2015 at 12:57:18PM +0200, Peter Zijlstra wrote:
> Seeing how we disable C states, it's unlikely to use less energy, so what
> exactly is its benefit, other than using fancy new instructions?

If the "disabling of C states" turns into a "go into a C-state which has
small wakeup latency" in the future, would be cool. Because we have a
bunch of places where we do delay.

Rui, what happens if you enter C1 instead? Any improvements or is the
wakeup latency too high?

Thanks.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

* Re: [PATCH v5 1/2] x86, mwaitt: add monitorx and mwaitx instruction
  2015-06-15 10:55   ` Peter Zijlstra
@ 2015-06-15 12:42     ` Huang Rui
  0 siblings, 0 replies; 11+ messages in thread
From: Huang Rui @ 2015-06-15 12:42 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Borislav Petkov, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	Rafael J. Wysocki, Len Brown, John Stultz,
	Frédéric Weisbecker, linux-kernel, x86, Fengguang Wu,
	Aaron Lu, Suravee Suthikulanit, Tony Li, Ken Xue

On Mon, Jun 15, 2015 at 12:55:42PM +0200, Peter Zijlstra wrote:
> On Mon, Jun 15, 2015 at 06:48:03PM +0800, Huang Rui wrote:
> > +/*
> > + * MWAITT allows for both a timer value to get you out of the MWAIT as
> > + * well as the normal exit conditions.
> > + *
> > + * MWAITX ECX[1]: enable timer if set
> > + * MWAITX EBX[31:0]: max wait time expressed in SW P0 clocks
> > + *
> > + * Below is the compared data between MWAIT and MWAITX on AMD
> > + * processors:
> > + *                 MWAIT                           MWAITX
> > + * opcode          0f 01 c9           |            0f 01 fb
> > + * ECX[0]                  value of RFLAGS.IF seen by instruction
> > + * ECX[1]          unused/#GP if set  |            enable timer if set
> > + * ECX[31:2]                     unused/#GP if set
> > + * EAX                           unused (reserve for hint)
> 
> Seeing how you're stuffing a !0 value in here in the next patch, the
> above comment seems slightly incorrect, no?
> 

That's because the current processor doesn't support entering the C1
state. The next patch uses the hint to prevent the core from entering C1
on future processors.

Thanks,
Rui

* Re: [PATCH v5 2/2] x86, mwaitt: introduce mwaix delay with a configurable timer
  2015-06-15 11:15     ` Borislav Petkov
@ 2015-06-15 14:04       ` Huang Rui
  2015-06-15 14:25         ` Borislav Petkov
  0 siblings, 1 reply; 11+ messages in thread
From: Huang Rui @ 2015-06-15 14:04 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Peter Zijlstra, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	Rafael J. Wysocki, Len Brown, John Stultz,
	Frédéric Weisbecker, linux-kernel, x86, Fengguang Wu,
	Aaron Lu, Suravee Suthikulanit, Tony Li, Ken Xue

On Mon, Jun 15, 2015 at 01:15:26PM +0200, Borislav Petkov wrote:
> On Mon, Jun 15, 2015 at 12:57:18PM +0200, Peter Zijlstra wrote:
> > Seeing how we disable C states, it's unlikely to use less energy, so what
> > exactly is its benefit, other than using fancy new instructions?
> 
> If the "disabling of C states" turns into a "go into a C-state which has
> small wakeup latency" in the future, would be cool. Because we have a
> bunch of places where we do delay.
> 
> Rui, what happens if you enter C1 instead? Any improvements or is the
> wakeup latency too high?
> 

Hmm, the current processor cannot enter C1 with MWAITX, so I can't
confirm whether C1 would have higher wakeup latency in the future.

Thanks,
Rui

* Re: [PATCH v5 2/2] x86, mwaitt: introduce mwaix delay with a configurable timer
  2015-06-15 10:57   ` Peter Zijlstra
  2015-06-15 11:15     ` Borislav Petkov
@ 2015-06-15 14:14     ` Huang Rui
  1 sibling, 0 replies; 11+ messages in thread
From: Huang Rui @ 2015-06-15 14:14 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Borislav Petkov, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	Rafael J. Wysocki, Len Brown, John Stultz,
	Frédéric Weisbecker, linux-kernel, x86, Fengguang Wu,
	Aaron Lu, Suravee Suthikulanit, Tony Li, Ken Xue

On Mon, Jun 15, 2015 at 12:57:18PM +0200, Peter Zijlstra wrote:
> On Mon, Jun 15, 2015 at 06:48:04PM +0800, Huang Rui wrote:
> > diff --git a/arch/x86/include/asm/mwait.h b/arch/x86/include/asm/mwait.h
> > index 1fbc89d..47f3540 100644
> > --- a/arch/x86/include/asm/mwait.h
> > +++ b/arch/x86/include/asm/mwait.h
> > @@ -14,6 +14,9 @@
> >  #define CPUID5_ECX_INTERRUPT_BREAK	0x2
> >  
> >  #define MWAIT_ECX_INTERRUPT_BREAK	0x1
> > +#define MWAITX_ECX_TIMER_ENABLE		BIT(1)
> > +#define MWAITX_MAX_LOOPS		((u32)-1)
> > +#define MWAITX_DISABLE_CSTATES		0xf
> >  
> >  static inline void __monitor(const void *eax, unsigned long ecx,
> >  			     unsigned long edx)
> 
> Should this hunk not be part of the previous patch?
> 

These definitions are used to implement the delay function, so I put
them in this patch. :)

Thanks,
Rui

* Re: [PATCH v5 2/2] x86, mwaitt: introduce mwaix delay with a configurable timer
  2015-06-15 14:04       ` Huang Rui
@ 2015-06-15 14:25         ` Borislav Petkov
  2015-06-15 14:33           ` Huang Rui
  0 siblings, 1 reply; 11+ messages in thread
From: Borislav Petkov @ 2015-06-15 14:25 UTC (permalink / raw)
  To: Huang Rui
  Cc: Peter Zijlstra, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	Rafael J. Wysocki, Len Brown, John Stultz,
	Frédéric Weisbecker, linux-kernel, x86, Fengguang Wu,
	Aaron Lu, Suravee Suthikulanit, Tony Li, Ken Xue

On Mon, Jun 15, 2015 at 10:04:19PM +0800, Huang Rui wrote:
> Hmm, the current processor cannot enter C1 with MWAITX, so I can't
> confirm whether C1 would have higher wakeup latency in the future.

I remember you saying MWAITX enters currently something between C1 and
C0. How does that behave wrt latency and power savings?

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

* Re: [PATCH v5 2/2] x86, mwaitt: introduce mwaix delay with a configurable timer
  2015-06-15 14:25         ` Borislav Petkov
@ 2015-06-15 14:33           ` Huang Rui
  0 siblings, 0 replies; 11+ messages in thread
From: Huang Rui @ 2015-06-15 14:33 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Peter Zijlstra, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	Rafael J. Wysocki, Len Brown, John Stultz,
	Frédéric Weisbecker, linux-kernel, x86, Fengguang Wu,
	Aaron Lu, Suravee Suthikulanit, Tony Li, Ken Xue

On Mon, Jun 15, 2015 at 04:25:59PM +0200, Borislav Petkov wrote:
> On Mon, Jun 15, 2015 at 10:04:19PM +0800, Huang Rui wrote:
> > Hmm, the current processor cannot enter C1 with MWAITX, so I can't
> > confirm whether C1 would have higher wakeup latency in the future.
> 
> I remember you saying MWAITX enters currently something between C1 and
> C0. How does that behave wrt latency and power savings?
> 

Yes, right. Power consumption is lower than in C0 (C0 > MWAITX > C1),
and exiting the wait is faster than with HLT.

Thanks,
Rui
