From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1753125AbdHQPxC (ORCPT ); Thu, 17 Aug 2017 11:53:02 -0400
Received: from mail.kernel.org ([198.145.29.99]:48028 "EHLO mail.kernel.org"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1751572AbdHQPxA (ORCPT ); Thu, 17 Aug 2017 11:53:00 -0400
DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 04FDC22C96
Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=kernel.org
Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=luto@kernel.org
MIME-Version: 1.0
In-Reply-To: <20170817103514.GC27872@arm.com>
References: <1502824706-30762-1-git-send-email-sai.praneeth.prakhya@intel.com>
        <1502824706-30762-4-git-send-email-sai.praneeth.prakhya@intel.com>
        <20170816095338.GB17270@leverpostej>
        <20170816100709.GG12845@arm.com>
        <20170816110321.GC17270@leverpostej>
        <20170816125715.GB3384@codeblueprint.co.uk>
        <20170815223541.GA25778@remoulade>
        <20170817103514.GC27872@arm.com>
From: Andy Lutomirski
Date: Thu, 17 Aug 2017 08:52:38 -0700
X-Gmail-Original-Message-ID:
Message-ID:
Subject: Re: [PATCH 3/3] x86/efi: Use efi_switch_mm() rather than manually twiddling with cr3
To: Will Deacon
Cc: Mark Rutland, Andy Lutomirski, Matt Fleming, Ard Biesheuvel,
        Sai Praneeth Prakhya, Peter Zijlstra,
        "linux-efi@vger.kernel.org", "linux-kernel@vger.kernel.org",
        joeyli, Borislav Petkov, "Michael S. Tsirkin",
        "Neri, Ricardo", "Ravi V. Shankar"
Content-Type: text/plain; charset="UTF-8"
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Aug 17, 2017 at 3:35 AM, Will Deacon wrote:
> On Tue, Aug 15, 2017 at 11:35:41PM +0100, Mark Rutland wrote:
>> On Wed, Aug 16, 2017 at 09:14:41AM -0700, Andy Lutomirski wrote:
>> > On Wed, Aug 16, 2017 at 5:57 AM, Matt Fleming wrote:
>> > > On Wed, 16 Aug, at 12:03:22PM, Mark Rutland wrote:
>> > >>
>> > >> I'd expect we'd abort at a higher level, not taking any sample. i.e.
>> > >> we'd have the core overflow handler check in_funny_mm(), and if so, skip
>> > >> the sample, as with the skid case.
>> > >
>> > > FYI, this is my preferred solution for x86 too.
>> >
>> > One option for the "funny mm" flag would be literally the condition
>> > current->mm != current->active_mm. I *think* this gets all the cases
>> > right as long as efi_switch_mm is careful with its ordering and that
>> > the arch switch_mm() code can handle the resulting ordering. (x86's
>> > can now, I think, or at least will be able to in 4.14 -- not sure
>> > about other arches).
>>
>> For arm64 we'd have to rework things a bit to get the ordering right
>> (especially when we flip to/from the idmap), but otherwise this sounds sane to
>> me.
>>
>> > That being said, there's a totally different solution: run EFI
>> > callbacks in a kernel thread. This has other benefits: we could run
>> > those callbacks in user mode some day, and doing *that* in a user
>> > thread seems like a mistake.
>>
>> I think that wouldn't work for CPU-bound perf events (which are not
>> ctx-switched with the task).
>>
>> It might be desireable to do that anyway, though.
>
> I'm still concerned that we're treating perf specially here -- are we
> absolutely sure that nobody else is going to attempt user accesses off the
> back of an interrupt?

Reasonably sure?
If nothing else, an interrupt taken while mmap_sem is held for write
that tries to access user memory is asking for serious trouble. There
are still a few callers of pagefault_disable() and copy...inatomic(),
though.

> If not, then I'd much prefer a solution that catches
> anybody doing that with the EFI page table installed, rather than trying
> to play whack-a-mole like this.

Using a kernel thread solves the problem for real. Anything that
blindly accesses user memory in kernel thread context is terminally
broken no matter what.

>
> Will
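
For reference, the "funny mm" condition quoted above can be modelled in
a few lines of userspace C. This is only a sketch under stated
assumptions: the two structs are stand-ins carrying just the fields the
check reads, not the kernel's definitions; in_funny_mm() is the
hypothetical name used earlier in the thread; should_take_sample() is
an invented name for the "skip the sample" decision; and it assumes
that efi_switch_mm changes only active_mm, which is what the proposed
condition implies.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the kernel's mm_struct; only an identity is needed here. */
struct mm_struct {
	const char *name;
};

/* Stand-in for task_struct with just the two fields the check compares. */
struct task_struct {
	struct mm_struct *mm;        /* address space the task owns; NULL for a kernel thread */
	struct mm_struct *active_mm; /* address space whose page tables are in use right now */
};

/*
 * Hypothetical helper: true when the task is not running on its own
 * address space, i.e. the condition mm != active_mm from the thread.
 */
bool in_funny_mm(const struct task_struct *tsk)
{
	return tsk->mm != tsk->active_mm;
}

/*
 * Hypothetical model of the overflow-handler decision: take no sample
 * (and so touch no user memory) while a funny mm is installed.
 */
bool should_take_sample(const struct task_struct *tsk)
{
	return !in_funny_mm(tsk);
}
```

Under these assumptions an ordinary user task (mm == active_mm) is
sampled, a task that has temporarily switched active_mm to the EFI mm
is skipped, and a kernel thread (mm == NULL with a borrowed active_mm)
is skipped as well. The last case is consistent with the point that
nothing should blindly access user memory from kernel thread context.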