From mboxrd@z Thu Jan 1 00:00:00 1970
From: Huang Rui
To: Borislav Petkov, Andy Lutomirski, Thomas Gleixner, Peter Zijlstra,
	Ingo Molnar, "Rafael J. Wysocki", "Len Brown", John Stultz,
	Frédéric Weisbecker
CC: , , Fengguang Wu, Aaron Lu, Suravee Suthikulanit, Tony Li, Ken Xue,
	Huang Rui
Subject: [PATCH v5 2/2] x86, mwaitt: introduce mwaix delay with a configurable timer
Date: Mon, 15 Jun 2015 18:48:04 +0800
Message-ID: <1434365284-1495-3-git-send-email-ray.huang@amd.com>
X-Mailer: git-send-email 1.9.1
In-Reply-To: <1434365284-1495-1-git-send-email-ray.huang@amd.com>
References: <1434365284-1495-1-git-send-email-ray.huang@amd.com>
MIME-Version: 1.0
Content-Type: text/plain
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

MWAITX can enable a timer whose value is specified in SW P0 clocks. The
SW P0 frequency is the same as the TSC frequency. The timer provides an
upper bound on how long the instruction waits before exiting. The
kernel's delay function can leverage the MWAITX timer.

This patch provides a new method (delay_mwaitx) to measure delay time.

Suggested-by: Andy Lutomirski
Suggested-by: Borislav Petkov
Suggested-by: Peter Zijlstra
Signed-off-by: Huang Rui
---
 arch/x86/include/asm/delay.h |  1 +
 arch/x86/include/asm/mwait.h |  3 +++
 arch/x86/kernel/cpu/amd.c    |  4 ++++
 arch/x86/lib/delay.c         | 45 ++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 53 insertions(+)

diff --git a/arch/x86/include/asm/delay.h b/arch/x86/include/asm/delay.h
index 9b3b4f2..36a760b 100644
--- a/arch/x86/include/asm/delay.h
+++ b/arch/x86/include/asm/delay.h
@@ -4,5 +4,6 @@
 #include
 
 void use_tsc_delay(void);
+void use_mwaitx_delay(void);
 
 #endif /* _ASM_X86_DELAY_H */
diff --git a/arch/x86/include/asm/mwait.h b/arch/x86/include/asm/mwait.h
index 1fbc89d..47f3540 100644
--- a/arch/x86/include/asm/mwait.h
+++ b/arch/x86/include/asm/mwait.h
@@ -14,6 +14,9 @@
 #define CPUID5_ECX_INTERRUPT_BREAK	0x2
 
 #define MWAIT_ECX_INTERRUPT_BREAK	0x1
+#define MWAITX_ECX_TIMER_ENABLE		BIT(1)
+#define MWAITX_MAX_LOOPS		((u32)-1)
+#define MWAITX_DISABLE_CSTATES		0xf
 
 static inline void __monitor(const void *eax, unsigned long ecx,
 			     unsigned long edx)
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 5bd3a99..1f0a8e2 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -11,6 +11,7 @@
 #include
 #include
 #include
+#include
 
 #ifdef CONFIG_X86_64
 # include
@@ -661,6 +662,9 @@ static void init_amd(struct cpuinfo_x86 *c)
 
 	early_init_amd(c);
 
+	if (static_cpu_has_safe(X86_FEATURE_MWAITT))
+		use_mwaitx_delay();
+
 	/*
 	 * Bit 31 in normal CPUID used for nonstandard 3DNow ID;
 	 * 3DNow is IDd by bit 31 in extended CPUID (1*32+31) anyway
diff --git a/arch/x86/lib/delay.c b/arch/x86/lib/delay.c
index 39d6a3d..035b6f6 100644
--- a/arch/x86/lib/delay.c
+++ b/arch/x86/lib/delay.c
@@ -20,6 +20,7 @@
 #include
 #include
 #include
+#include
 
 #ifdef CONFIG_SMP
 # include
@@ -87,6 +88,45 @@ static void delay_tsc(unsigned long __loops)
 }
 
 /*
+ * On AMD platforms mwaitx has a configurable 32-bit timer that counts
+ * at TSC frequency. The input value is the number of counter ticks; the
+ * instruction exits when the timer expires.
+ */
+static void delay_mwaitx(unsigned long __loops)
+{
+	u32 end, start, delay, loops = __loops;
+
+	rdtsc_barrier();
+	rdtscl(start);
+
+	for (;;) {
+		delay = min(MWAITX_MAX_LOOPS, loops);
+
+		/*
+		 * Use cpu_tss as a cacheline-aligned, seldom
+		 * accessed per-cpu variable as the monitor target.
+		 */
+		__monitorx(this_cpu_ptr(&cpu_tss), 0, 0);
+		/*
+		 * AMD, like Intel, supports the EAX hint, and EAX=0xf
+		 * means do not enter any deep C-state. We use it
+		 * here in delay() to minimize wakeup latency.
+		 */
+		__mwaitx(MWAITX_DISABLE_CSTATES, delay, MWAITX_ECX_TIMER_ENABLE);
+
+		rdtsc_barrier();
+		rdtscl(end);
+
+		if (loops <= end - start)
+			break;
+
+		loops -= end - start;
+
+		start = end;
+	}
+}
+
+/*
  * Since we calibrate only once at boot, this
  * function should be set once at boot and not changed
  */
@@ -97,6 +137,11 @@ void use_tsc_delay(void)
 	delay_fn = delay_tsc;
 }
 
+void use_mwaitx_delay(void)
+{
+	delay_fn = delay_mwaitx;
+}
+
 int read_current_timer(unsigned long *timer_val)
 {
 	if (delay_fn == delay_tsc) {
-- 
1.9.1
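
Illustrative note: the accounting in the delay_mwaitx() loop can be
exercised in user space with a rough sketch like the one below. It is
not kernel code and not part of the patch. fake_tsc, fake_wakeup_after,
fake_rdtsc(), fake_mwaitx() and sim_delay() are hypothetical stand-ins
invented for this illustration; the simulated wait simply advances a
32-bit counter by the timeout, or by less when an early wakeup is
modelled. The sketch assumes the same u32 arithmetic as the patch, so
end - start stays correct when the counter wraps.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the hardware (assumptions, not kernel APIs). */
static uint32_t fake_tsc;          /* simulated low 32 bits of the TSC */
static uint32_t fake_wakeup_after; /* 0 = no early wakeup, else ticks until one */

static uint32_t fake_rdtsc(void)
{
	return fake_tsc;
}

/* Simulated timed wait: returns after `timeout` ticks or an earlier wakeup. */
static void fake_mwaitx(uint32_t timeout)
{
	uint32_t slept = timeout;

	if (fake_wakeup_after && fake_wakeup_after < timeout)
		slept = fake_wakeup_after;

	fake_tsc += slept; /* wraps modulo 2^32 */
}

/*
 * Same accounting as the loop in the patch: after every wait, subtract
 * the ticks that really elapsed and wait again for the remainder.
 * Returns how many waits were issued.
 */
static unsigned int sim_delay(uint32_t loops)
{
	uint32_t start, end;
	unsigned int waits = 0;

	start = fake_rdtsc();

	for (;;) {
		fake_mwaitx(loops);
		waits++;

		end = fake_rdtsc();

		if (loops <= end - start)
			break;

		loops -= end - start;
		start = end;
	}

	return waits;
}
```

With no early wakeup, a request for 1000 ticks finishes in one wait.
With a wakeup modelled every 300 ticks, the same request takes four
waits (300 + 300 + 300 + 100) and the counter still advances by 1000
in total, which is the property the loop is meant to guarantee.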