From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751604AbdHPKHM (ORCPT ); Wed, 16 Aug 2017 06:07:12 -0400
Received: from usa-sjc-mx-foss1.foss.arm.com ([217.140.101.70]:33264 "EHLO
	foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1751370AbdHPKHK (ORCPT ); Wed, 16 Aug 2017 06:07:10 -0400
Date: Wed, 16 Aug 2017 11:07:10 +0100
From: Will Deacon
To: Mark Rutland
Cc: Ard Biesheuvel, Andy Lutomirski, Sai Praneeth Prakhya,
	Peter Zijlstra, "linux-efi@vger.kernel.org",
	"linux-kernel@vger.kernel.org", joeyli, Borislav Petkov,
	"Michael S. Tsirkin", "Neri, Ricardo", Matt Fleming,
	"Ravi V. Shankar"
Subject: Re: [PATCH 3/3] x86/efi: Use efi_switch_mm() rather than manually
	twiddling with cr3
Message-ID: <20170816100709.GG12845@arm.com>
References: <1502824706-30762-1-git-send-email-sai.praneeth.prakhya@intel.com>
	<1502824706-30762-4-git-send-email-sai.praneeth.prakhya@intel.com>
	<20170816095338.GB17270@leverpostej>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20170816095338.GB17270@leverpostej>
User-Agent: Mutt/1.5.23 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Aug 16, 2017 at 10:53:38AM +0100, Mark Rutland wrote:
> On Wed, Aug 16, 2017 at 10:31:12AM +0100, Ard Biesheuvel wrote:
> > (+ Mark, Will)
> >
> > On 15 August 2017 at 22:46, Andy Lutomirski wrote:
> > > On Tue, Aug 15, 2017 at 12:18 PM, Sai Praneeth Prakhya
> > > wrote:
> > >> +/*
> > >> + * Makes the calling kernel thread switch to/from efi_mm context
> > >> + * Can be used from SetVirtualAddressMap() or during efi runtime calls
> > >> + * (Note: This routine is heavily inspired from use_mm)
> > >> + */
> > >> +void efi_switch_mm(struct mm_struct *mm)
> > >> +{
> > >> +	struct task_struct *tsk = current;
> > >> +
> > >> +	task_lock(tsk);
> > >> +	efi_scratch.prev_mm = tsk->active_mm;
> > >> +	if (efi_scratch.prev_mm != mm) {
> > >> +		mmgrab(mm);
> > >> +		tsk->active_mm = mm;
> > >> +	}
> > >> +	switch_mm(efi_scratch.prev_mm, mm, NULL);
> > >> +	task_unlock(tsk);
> > >> +
> > >> +	if (efi_scratch.prev_mm != mm)
> > >> +		mmdrop(efi_scratch.prev_mm);
> > >
> > > I'm confused. You're mmdropping an mm that you are still keeping a
> > > pointer to. This is also a bit confusing in the case where you do
> > > efi_switch_mm(efi_scratch.prev_mm).
> > >
> > > This whole manipulation seems fairly dangerous to me for another
> > > reason -- you're taking a user thread (I think) and swapping out its
> > > mm to something that the user in question should *not* have access to.
> > > What if a perf interrupt happens while you're in the alternate mm?
> > > What if you segfault and dump core? Should we maybe just have a flag
> > > that says "this cpu is using a funny mm", assert that the flag is
> > > clear when scheduling, and teach perf, coredumps, etc not to touch
> > > user memory when the flag is set?
> >
> > It appears we may have introduced this exact issue on arm64 and ARM by
> > starting to run the UEFI runtime services with interrupts enabled.
> > (perf does not use NMI on ARM, so the issue did not exist beforehand)
> >
> > Mark, Will, any thoughts?
>
> Yup, I can cause perf to take samples from the EFI FW code, so that's
> less than ideal.

But that should only happen if you're profiling EL1, right, which needs
root privileges? (assuming the skid issue is solved -- not sure what
happened to those patches after they broke criu).

> The "funny mm" flag sounds like a good idea to me, though given recent
> pain with sampling in the case of skid, I don't know exactly what we
> should do if/when we take an overflow interrupt while in EFI.

I don't think special-casing perf interrupts is the right thing to do here.
If we're concerned about user-accesses being made off the back of
interrupts taken whilst in EFI, then we should probably either swizzle back
in the user page table on the IRQ path or postpone handling it until we're
done with the firmware. Having a flag feels a bit weird: would the uaccess
routines return -EFAULT if it's set?

Will
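
For illustration only, the flag variant questioned above could look roughly
like the userspace sketch below. Everything in it is hypothetical:
efi_mm_active and the sketch_* helpers are made-up names, not part of the
patch or of the kernel, and a real version would need a per-CPU flag rather
than a single global. It only models a uaccess-style helper returning -EFAULT
while the flag is set, plus the "flag must be clear when scheduling" assertion
from the quoted proposal. It does not settle whether that is the right
behaviour.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a per-CPU "this cpu is using a funny mm" flag. */
static bool efi_mm_active;

/* Hypothetical wrappers around the EFI mm switch; only the flag is modelled. */
static void sketch_efi_enter(void) { efi_mm_active = true; }
static void sketch_efi_exit(void)  { efi_mm_active = false; }

/*
 * Hypothetical uaccess-style helper: refuses to touch "user" memory while
 * the flag is set. Returns 0 on success, -EFAULT otherwise.
 */
static int sketch_copy_from_user(void *dst, const void *src, size_t len)
{
	if (efi_mm_active)
		return -EFAULT;
	memcpy(dst, src, len);
	return 0;
}

/* The proposed sanity check: the flag must be clear when scheduling. */
static void sketch_schedule_check(void)
{
	assert(!efi_mm_active);
}
```

Under these assumptions, an interrupt handler that tries a user access
between sketch_efi_enter() and sketch_efi_exit() simply gets -EFAULT back.
The alternatives raised in the reply (switching the user page table back in
on the IRQ path, or deferring the work until the firmware call returns) are
not modelled here.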