From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4265FC433EF for ; Mon, 27 Sep 2021 19:08:07 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 27C8460F92 for ; Mon, 27 Sep 2021 19:08:07 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235988AbhI0TJo (ORCPT ); Mon, 27 Sep 2021 15:09:44 -0400 Received: from mga02.intel.com ([134.134.136.20]:49945 "EHLO mga02.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235964AbhI0TJn (ORCPT ); Mon, 27 Sep 2021 15:09:43 -0400 X-IronPort-AV: E=McAfee;i="6200,9189,10120"; a="211794152" X-IronPort-AV: E=Sophos;i="5.85,327,1624345200"; d="scan'208";a="211794152" Received: from fmsmga001.fm.intel.com ([10.253.24.23]) by orsmga101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Sep 2021 12:07:57 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.85,327,1624345200"; d="scan'208";a="615940061" Received: from orsmsx606.amr.corp.intel.com ([10.22.229.19]) by fmsmga001.fm.intel.com with ESMTP; 27 Sep 2021 12:07:56 -0700 Received: from orsmsx607.amr.corp.intel.com (10.22.229.20) by ORSMSX606.amr.corp.intel.com (10.22.229.19) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2242.12; Mon, 27 Sep 2021 12:07:56 -0700 Received: from ORSEDG601.ED.cps.intel.com (10.7.248.6) by orsmsx607.amr.corp.intel.com (10.22.229.20) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2242.12 via Frontend Transport; Mon, 27 Sep 2021 12:07:56 -0700 Received: from NAM12-DM6-obe.outbound.protection.outlook.com (104.47.59.174) by edgegateway.intel.com (134.134.137.102) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.1.2242.12; Mon, 27 Sep 2021 12:07:41 -0700 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=NEBXdX5zWQrFiGsv8TjsnBwRmoIe0e6zGdAWql7ZaZqWdIeP7KLMYdgaIJ0RufSOHnqIVQdZl2ddYBKYAUdTjkQuOs9UReSLAwgfLZ7lo2VHW3NWYkzRQgtzhtpAN194GG2tuSm0l86XAURuaZR27qjOEGFkRU8kMohjHssKgJ2uP4dtz7qTH8sWQm8eSvtAAET+t7pTVh365p5mAXtqbqz9UawWKmACJ7847fKVO0KTLrPaiXDElhpshbPU2ubbhABzZjEgXSnlskgXLN216ATNPPqBtEFQZBw715mm0cnB4trBBkulzuJ84Ixii8cUsCrMMGJqckA0kB+FxA/0iQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version; bh=SztoQAuzqCwtRCrJ5ADxd4LVqOTSnbIXxse8p4Z/lws=; b=lDDJxmlorAb9gxE7jZ63bqZvyZcsl/KFcbOzXBVEIf8dRyfzn+GfUYOsr24NWjQzQdIo5dasgYTXcKAr4wqqAA9QojyEGJ3EHgrV7kRA8vCXrvrhSZ6l0N2kLiaWACPQsoLqwH6RU++BQLyyNY27YSc1FUvp01l1otOeTS8f8MArBAFsjBXDR5oEXdWCmBCPfRhcfHeuHvb4YJqN28IgP0i53R9t6meK6N8MSX87xBy95rdVnnp/aRFQ/+xv5fyGtVTPkXlQW41cXhnz2abU3GLh+OuasDr2gyzujlYzWzppArDNp+BpWDiezYwkXn8MTKK/54OpJ3s/xm5lobQcbQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=intel.com; dmarc=pass action=none header.from=intel.com; dkim=pass header.d=intel.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=intel.onmicrosoft.com; s=selector2-intel-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=SztoQAuzqCwtRCrJ5ADxd4LVqOTSnbIXxse8p4Z/lws=; b=oPSRki4VDD4xLGCE1VBpi9uURlphT48j+ajmvIPeRfltRi536wv+They3eTtBhDqM6ekB0TiMOcm7ImuOFZt2NOVm5hqa5/fnqM7q3fgovN9NxorWNxr3hwYDIiwcmV2Le0MX/0BBqn5lzNbx6rTjuejtcs1votzIXkkUhp5NFw= Authentication-Results: vger.kernel.org; dkim=none (message not signed) header.d=none;vger.kernel.org; dmarc=none action=none header.from=intel.com; Received: from BYAPR11MB3320.namprd11.prod.outlook.com (2603:10b6:a03:18::25) by BYAPR11MB2759.namprd11.prod.outlook.com (2603:10b6:a02:c8::13) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4544.15; Mon, 27 Sep 2021 19:07:38 +0000 Received: from BYAPR11MB3320.namprd11.prod.outlook.com ([fe80::4167:f9ef:19b2:eaff]) by BYAPR11MB3320.namprd11.prod.outlook.com ([fe80::4167:f9ef:19b2:eaff%3]) with mapi id 15.20.4544.021; Mon, 27 Sep 2021 19:07:38 +0000 Subject: Re: [RFC PATCH 05/13] x86/irq: Reserve a user IPI notification vector To: Thomas Gleixner , CC: Tony Luck , Dave Hansen , "Ingo Molnar" , Borislav Petkov , "H . Peter Anvin" , Andy Lutomirski , Jens Axboe , Christian Brauner , Peter Zijlstra , Shuah Khan , Arnd Bergmann , Jonathan Corbet , Ashok Raj , Jacob Pan , "Gayatri Kammela" , Zeng Guang , "Dan Williams" , Randy E Witt , Ravi V Shankar , Ramesh Thomas , , , , References: <20210913200132.3396598-1-sohil.mehta@intel.com> <20210913200132.3396598-6-sohil.mehta@intel.com> <87fstugabg.ffs@tglx> <878rzkeq9f.ffs@tglx> <87bl4fcxz8.ffs@tglx> From: Sohil Mehta Message-ID: <447377f0-21e5-067d-55ac-cb2eeca7ceae@intel.com> Date: Mon, 27 Sep 2021 12:07:36 -0700 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Firefox/78.0 Thunderbird/78.14.0 In-Reply-To: <87bl4fcxz8.ffs@tglx> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit Content-Language: en-US X-ClientProxiedBy: SJ0PR03CA0055.namprd03.prod.outlook.com (2603:10b6:a03:33e::30) To BYAPR11MB3320.namprd11.prod.outlook.com (2603:10b6:a03:18::25) MIME-Version: 1.0 Received: from [192.168.86.37] (73.222.31.188) by SJ0PR03CA0055.namprd03.prod.outlook.com (2603:10b6:a03:33e::30) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4544.13 via Frontend Transport; Mon, 27 Sep 2021 19:07:37 +0000 X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id: 958acfc1-022d-4476-70ac-08d981ea1279 X-MS-TrafficTypeDiagnostic: BYAPR11MB2759: X-LD-Processed: 46c98d88-e344-4ed4-8496-4ed7712e255d,ExtAddr,ExtFwd X-MS-Exchange-Transport-Forked: True X-Microsoft-Antispam-PRVS: X-MS-Oob-TLC-OOBClassifiers: OLM:10000; X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: wHvtFfFezoSUeXIAeoVvnMyVt62UBZqgn2Y229oUjdSpMHiMpW6YgwaWJ7S+1cj+FD8CQyEL0wPJymPYQyPfCotv3xGPEKFckSh54gSP8WNEUoRw/kYhUOVKIdgJ2bgeV8qlsBw3Mzd+d5PyFE2uZRtzr3APYFPJ+N9OhCh4CEgMFAIhqH73QMGABnNVY/uHftzUHPBSqbTHaxZZlyrBAf7sn7wp5JdIsTYKQc7IkaBeezo9G7Bp3fu8Yh9Jk1Uys8NwXZT2KVJhBj/VlqNSDVC7O02AEeHRu8gyiQvuwEteB0Jtcfwv7jE0LRxjVU0G4hDIV+QtUsfxHTKPhkI2Xd9LssgYdLzZ5uY0xZZXrLKwx4pQZ1lS8tDbm7pJDD8tNXBx+DhuBpv2eoHVThlIKNQdUAC3lMpYBGu4FEzOIcA2GZyvAZlduYUzmATRRM98/q1vsL9o0zfV+HSl+6CR2igQchvwvtkhJ9tp+ofHy055yl97gu+2PtpTIPB7mnJ5jpS57EXGjpjqhDX/YvY6W85hEF+K4ztsoWUgF2Msf/dVdA52HvM12UOv2U2WNMvNvuTYSK3VAOD/PFK5O9Sl/hwLZxlVXnk+qOP7BNzRpcUxtmPA5ZPDUaDNVbcvtBEK6wIIoONLUfqNSBf7CspJjldY+8Hy2SMZFyDlo/CInR24Q9j+JEWGA9sIRYdx+msLvUMJi8N36IHIqyqsrMu3aTQpTbKvdwvJs3/IfBq142s= X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:BYAPR11MB3320.namprd11.prod.outlook.com;PTR:;CAT:NONE;SFS:(4636009)(366004)(31696002)(53546011)(8936002)(38100700002)(508600001)(2616005)(6486002)(26005)(16576012)(44832011)(316002)(83380400001)(7416002)(956004)(66946007)(4326008)(86362001)(66556008)(54906003)(186003)(66476007)(2906002)(36756003)(5660300002)(15650500001)(8676002)(31686004)(43740500002)(45980500001);DIR:OUT;SFP:1102; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: =?utf-8?B?SC9oT1FkcGpobHh1aHhHaHJmdXZySkRRU0lUWGhCSTdRUzFVS3FkMkhZZWZ0?= =?utf-8?B?cUF6eGNwbkR0SWdRbDJ0SHkwVFFvb2Rna1VqWU4wczd5VGx3Y3BRRmFLWUVW?= =?utf-8?B?NEtxWS9jakNEWGpLeGJuZWdQeWRZdzNpbGpLcWRDWkMxWkJUUExwRkl4OCtF?= =?utf-8?B?c3U3NWVGbkFGRzlpdFZNWnA2S0NkaUtGdDlidFp1cXFMYmtBbU5wd3hZclZX?= =?utf-8?B?c0JXZXJ1T0pZeWFxcVVlOElrd0pQTGZCUWdGSFBXOXF3RXBacVI2dkdqcmw4?= =?utf-8?B?U00xN3lNYkhtYWJoa0dLT3lNNkRNemNCbXpkOXo1YVlsMHdpMW1wUXVLdVY5?= =?utf-8?B?Y2hQbDNEckkxVk52ZUNKYzIweFFaazdIMVg3QnZqaGVwaEMwSU0wcldFb2Uy?= =?utf-8?B?OFJOSGJCS3JZd1RsQ2pLWXRYMUhMTXd5eFhqbCtHQUVKK0lCdjBITEpKSmpk?= =?utf-8?B?c3QydU9mWlZONkhyemVNQ3BlY3JLbDVFeFYrVzBoN3JlM0pNZURVTFl4T0NI?= =?utf-8?B?TGZveENMMDJjQ1JvOVdhSjBFK3h6ZnFqbFJILzI1RkhHN0lEWFlsQ3d1Q2l4?= =?utf-8?B?bEhVb2g5UFR4c0pLNVJ6ZkJwY2YwZEhQblIxSm5Gd2R5bnJNVUg2M0NmZG9q?= =?utf-8?B?M3VkaXljVmJwRXRBYkxicEtwcVBlSmwyVHVVMTQrMXkvWkV1QTZEU2VrR3cz?= =?utf-8?B?bGxwL2lyZzBkcXVQclY2dG9wVk4yR0NoMFczdklETWhFOUlZdHZUZVRCNmg2?= =?utf-8?B?Wk13OVo1ck1NREU5ZFY2QlBRYmorWGRjcmJZVE03dEJ5U3p1cXN5dGo1eWxU?= =?utf-8?B?dEU1a0htcXNvTEpNQWJMd21MY0U3aWQ3aHdobkFENVpnZE5QRWlhekpGYmVY?= =?utf-8?B?ZXducW1hUVhjd3RpNzlnQzl4SkFLczFqVVJ5ZnVZY2pYSHZrRWlsMXRIOCtl?= =?utf-8?B?R3ZWM0daU0pXa2ltMzNnMldVK1JxUjhuNXRYV21hQm80cFlnNnRoeVo1cHRJ?= =?utf-8?B?cFpPcmVBeVQ3U2VUVVM3TGgyNSt1L1VjUDZEeGxoeXlVZHZMK3hWc1ZrcVln?= =?utf-8?B?bHlXdHlhOWkyc24zNjRzK2dEZ2h2dTZlbnJYTkU3SUxoYVhLM2RaQ1NweTFP?= =?utf-8?B?UGx5YnpDaVJLaGV4OWFDbUxUZ2taK3lSN3hSNHl4MmR1NmZiYmxlNGJEYkYy?= =?utf-8?B?MlZPLzFlSE1sZUUvek9HS010K3EvVkpJNnR4cDdpNlh0WFRTT3NMcEI2aGlV?= =?utf-8?B?TFZFL0xNMFdITU1hajlrbzNNV2ZmMmNtL3JONGhVNkQ2UmpydmlSWkNZNnZH?= =?utf-8?B?b1ptSFJ6QkdzZkNucUdHNFQrdFNWaWF3SWJqSFo2ZnQ4VWZrVG44b3lDOFlq?= =?utf-8?B?UGljSENEZktVRkNPcWxrKzFQSzJ4M29zOU1xOEFkT0pXU2poTHlMaUFvMS91?= =?utf-8?B?anhDMVlMbWpYWExNRklhTWJKcUppK1Y5UnlWdk5qUHRsN3NYV2NMV3Ztd1E3?= =?utf-8?B?L2hOMGVvREVrQjhHaEFuckNtUVJrWUNvOHYvU1dZUVBsMkdRSk5yNmRWaFA3?= =?utf-8?B?QkQvaUNCdVBpeW9HTGtmaWpFYm9FUER0dWNkV041SEttLzhLWTMyV1Q0aUJK?= =?utf-8?B?dTR5Yy9ybGdodU1MVjI5cGRyL2pjMGk0ZjQyb3ZLT2Z1Z29tbHZDSTdxY3Q5?= =?utf-8?B?Ti9pbGtPWnlqVmFVM3JLd0tLMVdaLy9IcTh3bTBhU25LUmxjdmhYSUtXVjBl?= =?utf-8?Q?C0hHJqocZJRH/t6SykWXBatQguqupY/MKklE9YB?= X-MS-Exchange-CrossTenant-Network-Message-Id: 958acfc1-022d-4476-70ac-08d981ea1279 X-MS-Exchange-CrossTenant-AuthSource: BYAPR11MB3320.namprd11.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 27 Sep 2021 19:07:38.5023 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 46c98d88-e344-4ed4-8496-4ed7712e255d X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: xKfPpY4idVQgna6W9vw6WDOQ9owhzb8BN3Mv6MtzYurNg+PvaCJAWnAst21Ig1sZSFg1t2r6Zre7yYigTeG4PA== X-MS-Exchange-Transport-CrossTenantHeadersStamped: BYAPR11MB2759 X-OriginatorOrg: intel.com Precedence: bulk List-ID: X-Mailing-List: linux-arch@vger.kernel.org On 9/26/2021 5:39 AM, Thomas Gleixner wrote: > On Sat, Sep 25 2021 at 15:30, Thomas Gleixner wrote: >> On Fri, Sep 24 2021 at 01:07, Thomas Gleixner wrote: >> The obvious question is: What is the value of clearing UINV? >> >> Absolutely none. That notification vector cannot be used for anything >> else, so why would the OS be interested to see it ever? This is about >> user space interupts, right? >> >> UINV should be set _ONCE_ when CR4.UINTR is enabled and not be touched >> by XSAVES/XRSTORS at all. Any delivery of this vector to the OS should >> be considered a hardware bug. > After decoding the documentation (sigh) and staring at the implications of > keeping UINV armed, I can see the point vs. the UPID lifetime issue when > a task gets scheduled out and migrated to a different CPU. I think you got it right. Here is my understanding of this. The User-interrupt notification processing moves all the pending interrupts from UPID.PIR to the UIRR. As you mentioned below, XSTATE is saved due to several reasons which saves the UIRR into memory. UIRR should no longer be updated after it has been saved. XSAVES clears UINV is to stop detecting additional interrupts in the UIRR after it has been saved. > Not the most pretty solution, but as there needs to be some invalidation > which needs to be undone on return to user space it probably does not > matter much. > > As the whole thing is tightly coupled to XSAVES/RSTORS we need to > integrate it into that machinery and not pretend that it's something > half independent. I agree. Thank you for pointing this out. > That means we have to handle the setting of the SN bit in UPID whenever > XSTATE is saved either during context switch, when the kernel uses the > FPU or in other places (signals, fpu_clone ...). They all end up in > save_fpregs_to_fpstate() so that might be the place to look at.  Yes. The current code doesn't do this. SN bit should be set whenever UINTR XSTATE is saved. > While talking about that: fpu_clone() has to invalidate the UINTR state > in the clone's xstate after the memcpy() or xsaves() operation. > > Also the restore portion on the way back to user space has to be coupled > more tightly: > > arch_exit_to_user_mode_prepare() > { > ... > if (unlikely(ti_work & _TIF_UPID)) > uintr_restore_upid(ti_work & _TIF_NEED_FPU_LOAD); > if (unlikely(ti_work & _TIF_NEED_FPU_LOAD)) > switch_fpu_return(); > } I am assuming _TIF_UPID would be set everytime SN is set and XSTATE is saved. > upid_set_ndst(upid) > { > apicid = __this_cpu_read(x86_cpu_to_apicid); > > if (x2apic_enabled()) > upid->ndst.x2apic = apicid; > else > upid->ndst.apic = apicid; > } > > uintr_restore_upid(bool xrstors_pending) > { > clear_thread_flag(TIF_UPID); > > // Update destination > upid_set_ndst(upid); > > // Do we need something stronger here? > barrier(); > > clear_bit(SN, upid->status); > > // Any SENDUIPI after this point sends to this CPU > > // Any bit which was set in upid->pir after SN was set > // and/or UINV was cleared by XSAVES up to the point > // where SN was cleared above is not reflected in UIRR. > > // As this runs with interrupts disabled the current state > // of upid->pir can be read and used for restore. A SENDUIPI > // which sets a bit in upid->pir after that read will send > // the notification vector which is going to be handled once > // the task reenables interrupts on return to user space. > // If the SENDUIPI set the bit before the read then the > // notification vector handling will just observe the same > // PIR state. > > // Needs to be a locked access as there might be a > // concurrent SENDUIPI modiying it. > pir = read_locked(upid->pir); > > if (xrstors_pending)) { > // Update the saved xstate for xrstors > current->xstate.uintr.uinv = UINTR_NOTIFICATION_VECTOR; XSAVES saves the UINV value into the XSTATE buffer. I am not sure if we need this again. Is it because it could have been overwritten by calling XSAVES twice? > current->xstate.uintr.uirr = pir; I believe PIR should be ORed. There could be some bits already set in the UIRR. Also, shouldn't UPID->PIR be cleared? If not, we would detect these interrupts all over again during the next ring transition. > } else { > // Manually restore UIRR and UINV > wrmsrl(IA32_UINTR_RR, pir); I believe read-modify-write here as well. > misc.val64 = 0; > misc.uittsz = current->uintr->uittsz; > misc.uinv = UINTR_NOTIFICATION_VECTOR; > wrmsrl(IA32_UINTR_MISC, misc.val64); Thanks! This helps reduce the additional MSR read. > } > } > > That's how I deciphered the documentation and I don't think this is far > from reality, but I might be wrong as usual. > > Hmm? Thank you for the simplification. This is very helpful. Sohil