Date: Thu, 27 Jan 2022 06:36:19 +0000
From: Sean Christopherson <seanjc@google.com>
To: Peter Zijlstra
Cc: mingo@redhat.com, tglx@linutronix.de, juri.lelli@redhat.com,
	vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org,
	bsegall@google.com, mgorman@suse.de, bristot@redhat.com,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-api@vger.kernel.org,
	x86@kernel.org, pjt@google.com, posk@google.com, avagin@google.com,
	jannh@google.com, tdelisle@uwaterloo.ca, mark.rutland@arm.com, posk@posk.io
Subject: Re: [RFC][PATCH v2 4/5] x86/uaccess: Implement unsafe_try_cmpxchg_user()
References: <20220120155517.066795336@infradead.org>
	<20220120160822.852009966@infradead.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii

On Thu, Jan 27, 2022, Sean Christopherson wrote:
> Doh, I should have specified that KVM needs 8-byte CMPXCHG on 32-bit kernels due
> to using it to atomically update guest PAE PTEs and LTR descriptors (yay).
>
> Also, KVM's use case isn't a tight loop, how gross would it be to add a slightly
> less unsafe version that does __uaccess_begin_nospec()?
> KVM pre-checks the address way ahead of time, so the access_ok() check can be
> omitted.  Alternatively, KVM could add its own macro, but that seems a little
> silly.  E.g. something like this, though I don't think this is correct *sigh*

Finally realized I forgot to add back the page offset after converting from
guest page frame to host virtual address.  Anyways, this is what I ended up
with, will test more tomorrow.

From: Peter Zijlstra
Date: Thu, 20 Jan 2022 16:55:21 +0100
Subject: [PATCH] x86/uaccess: Implement unsafe_try_cmpxchg_user()

Do try_cmpxchg() loops on userspace addresses.  Provide 8-byte versions for
32-bit kernels so that KVM can do cmpxchg on guest PAE PTEs, which are
accessed via userspace addresses.

Signed-off-by: Peter Zijlstra (Intel)
Co-developed-by: Sean Christopherson
Signed-off-by: Sean Christopherson
---
 arch/x86/include/asm/uaccess.h | 129 +++++++++++++++++++++++++++++++++
 1 file changed, 129 insertions(+)

diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index ac96f9b2d64b..b706008aed28 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -342,6 +342,45 @@ do {                                                          \
                     : [umem] "m" (__m(addr))                           \
                     : : label)

+#define __try_cmpxchg_user_asm(itype, ltype, _ptr, _pold, _new, label) ({ \
+       bool success;                                                   \
+       __typeof__(_ptr) _old = (__typeof__(_ptr))(_pold);              \
+       __typeof__(*(_ptr)) __old = *_old;                              \
+       __typeof__(*(_ptr)) __new = (_new);                             \
+       asm_volatile_goto("\n"                                          \
+                    "1: " LOCK_PREFIX "cmpxchg"itype" %[new], %[ptr]\n"\
+                    _ASM_EXTABLE_UA(1b, %l[label])                     \
+                    : CC_OUT(z) (success),                             \
+                      [ptr] "+m" (*_ptr),                              \
+                      [old] "+a" (__old)                               \
+                    : [new] ltype (__new)                              \
+                    : "memory"                                         \
+                    : label);                                          \
+       if (unlikely(!success))                                         \
+               *_old = __old;                                          \
+       likely(success);                                        })
+
+#ifdef CONFIG_X86_32
+#define __try_cmpxchg64_user_asm(_ptr, _pold, _new, label)     ({      \
+       bool success;                                                   \
+       __typeof__(_ptr) _old = (__typeof__(_ptr))(_pold);              \
+       __typeof__(*(_ptr)) __old = *_old;                              \
+       __typeof__(*(_ptr)) __new = (_new);                             \
+       asm_volatile_goto("\n"                                          \
+                    "1: " LOCK_PREFIX "cmpxchg8b %[ptr]\n"             \
+                    _ASM_EXTABLE_UA(1b, %l[label])                     \
+                    : CC_OUT(z) (success),                             \
+                      "+A" (__old),                                    \
+                      [ptr] "+m" (*_ptr)                               \
+                    : "b" ((u32)__new),                                \
+                      "c" ((u32)((u64)__new >> 32))                    \
+                    : "memory"                                         \
+                    : label);                                          \
+       if (unlikely(!success))                                         \
+               *_old = __old;                                          \
+       likely(success);                                        })
+#endif // CONFIG_X86_32
+
 #else // !CONFIG_CC_HAS_ASM_GOTO_OUTPUT

 #ifdef CONFIG_X86_32
@@ -407,6 +446,57 @@ do {                                                          \
                     : [umem] "m" (__m(addr)),                          \
                       "0" (err))

+#define __try_cmpxchg_user_asm(itype, ltype, _ptr, _pold, _new, label) ({ \
+       int __err = 0;                                                  \
+       bool success;                                                   \
+       __typeof__(_ptr) _old = (__typeof__(_ptr))(_pold);              \
+       __typeof__(*(_ptr)) __old = *_old;                              \
+       __typeof__(*(_ptr)) __new = (_new);                             \
+       asm volatile("\n"                                               \
+                    "1: " LOCK_PREFIX "cmpxchg"itype" %[new], %[ptr]\n"\
+                    CC_SET(z)                                          \
+                    "2:\n"                                             \
+                    _ASM_EXTABLE_TYPE_REG(1b, 2b, EX_TYPE_EFAULT_REG,  \
+                                          %[errout])                   \
+                    : CC_OUT(z) (success),                             \
+                      [errout] "+r" (__err),                           \
+                      [ptr] "+m" (*_ptr),                              \
+                      [old] "+a" (__old)                               \
+                    : [new] ltype (__new)                              \
+                    : "memory", "cc");                                 \
+       if (unlikely(__err))                                            \
+               goto label;                                             \
+       if (unlikely(!success))                                         \
+               *_old = __old;                                          \
+       likely(success);                                        })
+
+#ifdef CONFIG_X86_32
+#define __try_cmpxchg64_user_asm(_ptr, _pold, _new, label)     ({      \
+       int __err = 0;                                                  \
+       bool success;                                                   \
+       __typeof__(_ptr) _old = (__typeof__(_ptr))(_pold);              \
+       __typeof__(*(_ptr)) __old = *_old;                              \
+       __typeof__(*(_ptr)) __new = (_new);                             \
+       asm volatile("\n"                                               \
+                    "1: " LOCK_PREFIX "cmpxchg8b %[ptr]\n"             \
+                    CC_SET(z)                                          \
+                    "2:\n"                                             \
+                    _ASM_EXTABLE_TYPE_REG(1b, 2b, EX_TYPE_EFAULT_REG,  \
+                                          %[errout])                   \
+                    : CC_OUT(z) (success),                             \
+                      [errout] "+r" (__err),                           \
+                      "+A" (__old),                                    \
+                      [ptr] "+m" (*_ptr)                               \
+                    : "b" ((u32)__new),                                \
+                      "c" ((u32)((u64)__new >> 32))                    \
+                    : "memory", "cc");                                 \
+       if (unlikely(__err))                                            \
+               goto label;                                             \
+       if (unlikely(!success))                                         \
+               *_old = __old;                                          \
+       likely(success);                                        })
+#endif // CONFIG_X86_32
+
 #endif // CONFIG_CC_HAS_ASM_GOTO_OUTPUT

 /* FIXME: this hack is definitely wrong -AK */
@@ -501,6 +591,45 @@ do {                                                          \
 } while (0)
 #endif // CONFIG_CC_HAS_ASM_GOTO_OUTPUT

+extern void __try_cmpxchg_user_wrong_size(void);
+
+#ifndef CONFIG_X86_32
+#define __try_cmpxchg64_user_asm(_ptr, _oldp, _nval, _label)           \
+       __try_cmpxchg_user_asm("q", "r", (_ptr), (_oldp), (_nval), _label)
+#endif
+
+#define unsafe_try_cmpxchg_user(_ptr, _oldp, _nval, _label) ({         \
+       bool __ret;                                                     \
+       switch (sizeof(*(_ptr))) {                                      \
+       case 1: __ret = __try_cmpxchg_user_asm("b", "q",                \
+                                              (_ptr), (_oldp),         \
+                                              (_nval), _label);        \
+               break;                                                  \
+       case 2: __ret = __try_cmpxchg_user_asm("w", "r",                \
+                                              (_ptr), (_oldp),         \
+                                              (_nval), _label);        \
+               break;                                                  \
+       case 4: __ret = __try_cmpxchg_user_asm("l", "r",                \
+                                              (_ptr), (_oldp),         \
+                                              (_nval), _label);        \
+               break;                                                  \
+       case 8: __ret = __try_cmpxchg64_user_asm((_ptr), (_oldp),       \
+                                                (_nval), _label);      \
+               break;                                                  \
+       default: __try_cmpxchg_user_wrong_size();                       \
+       }                                                               \
+       __ret; })
+
+/* "Returns" 0 on success, 1 on failure, -EFAULT if the access faults. */
+#define __try_cmpxchg_user(_ptr, _oldp, _nval, _label) ({              \
+       int __ret = -EFAULT;                                            \
+       __uaccess_begin_nospec();                                       \
+       __ret = !unsafe_try_cmpxchg_user(_ptr, _oldp, _nval, _label);   \
+_label:                                                                \
+       __uaccess_end();                                                \
+       __ret;                                                          \
+})
+
 /*
  * We want the unsafe accessors to always be inlined and use
  * the error labels - thus the macro games.
--