From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755145AbbFOK5h (ORCPT ); Mon, 15 Jun 2015 06:57:37 -0400 Received: from bombadil.infradead.org ([198.137.202.9]:43152 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753971AbbFOK53 (ORCPT ); Mon, 15 Jun 2015 06:57:29 -0400 Date: Mon, 15 Jun 2015 12:57:18 +0200 From: Peter Zijlstra To: Huang Rui Cc: Borislav Petkov , Andy Lutomirski , Thomas Gleixner , Ingo Molnar , "Rafael J. Wysocki" , Len Brown , John Stultz , =?iso-8859-1?Q?Fr=E9d=E9ric?= Weisbecker , linux-kernel@vger.kernel.org, x86@kernel.org, Fengguang Wu , Aaron Lu , Suravee Suthikulanit , Tony Li , Ken Xue Subject: Re: [PATCH v5 2/2] x86, mwaitt: introduce mwaix delay with a configurable timer Message-ID: <20150615105718.GW3644@twins.programming.kicks-ass.net> References: <1434365284-1495-1-git-send-email-ray.huang@amd.com> <1434365284-1495-3-git-send-email-ray.huang@amd.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1434365284-1495-3-git-send-email-ray.huang@amd.com> User-Agent: Mutt/1.5.21 (2012-12-30) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Jun 15, 2015 at 06:48:04PM +0800, Huang Rui wrote: > diff --git a/arch/x86/include/asm/mwait.h b/arch/x86/include/asm/mwait.h > index 1fbc89d..47f3540 100644 > --- a/arch/x86/include/asm/mwait.h > +++ b/arch/x86/include/asm/mwait.h > @@ -14,6 +14,9 @@ > #define CPUID5_ECX_INTERRUPT_BREAK 0x2 > > #define MWAIT_ECX_INTERRUPT_BREAK 0x1 > +#define MWAITX_ECX_TIMER_ENABLE BIT(1) > +#define MWAITX_MAX_LOOPS ((u32)-1) > +#define MWAITX_DISABLE_CSTATES 0xf > > static inline void __monitor(const void *eax, unsigned long ecx, > unsigned long edx) Should this hunk not be part of the previous patch? > /* > + * On AMD platforms mwaitx has a configurable 32-bit timer, that counts > + * with TSC frequency. And the input value is the loop of the counter, it > + * will exit with the timer expired. > + */ > +static void delay_mwaitx(unsigned long __loops) > +{ > + u32 end, start, delay, loops = __loops; > + > + rdtsc_barrier(); > + rdtscl(start); > + > + for (;;) { > + delay = min(MWAITX_MAX_LOOPS, loops); > + > + /* > + * Use cpu_tss as a cacheline-aligned, seldomly > + * accessed per-cpu variable as the monitor target. > + */ > + __monitorx(this_cpu_ptr(&cpu_tss), 0, 0); > + /* > + * AMD, like Intel, supports the EAX hint and EAX=0xf > + * means, do not enter any deep C-state and we use it > + * here in delay() to minimize wakeup latency. > + */ > + __mwaitx(MWAITX_DISABLE_CSTATES, delay, MWAITX_ECX_TIMER_ENABLE); > + > + rdtsc_barrier(); > + rdtscl(end); > + > + if (loops <= end - start) > + break; > + > + loops -= end - start; > + > + start = end; > + } > +} OK, so what is not explained is how this delay is 'better' than the TSC delay loop we currently have. Seeing how we disable C states, its unlikely to use less energy, so what exactly is its benefit, other than using fancy new instructions?