From mboxrd@z Thu Jan 1 00:00:00 1970
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1754840AbbGJP12 (ORCPT ); Fri, 10 Jul 2015 11:27:28 -0400
Received: from mail-oi0-f51.google.com ([209.85.218.51]:35545 "EHLO
	mail-oi0-f51.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1754775AbbGJP1R (ORCPT );
	Fri, 10 Jul 2015 11:27:17 -0400
MIME-Version: 1.0
Date: Fri, 10 Jul 2015 11:27:16 -0400
Subject: Re: [RFC/PATCH 5/7] x86/vm86: Teach handle_vm86_trap to return to 32bit mode directly
From: Brian Gerst
To: Andy Lutomirski
Cc: Andy Lutomirski, X86 ML, "linux-kernel@vger.kernel.org",
	Frédéric Weisbecker, Rik van Riel, Oleg Nesterov, Denys Vlasenko,
	Borislav Petkov, Kees Cook, Linus Torvalds
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Jul 9, 2015 at 9:33 PM, Andy Lutomirski wrote:
> On Thu, Jul 9, 2015 at 3:41 PM, Andy Lutomirski wrote:
>> On Wed, Jul 8, 2015 at 12:24 PM, Andy Lutomirski wrote:
>>> The TIF_NOTIFY_RESUME hack it was using was buggy and unsupportable.
>>> vm86 mode was completely broken under ptrace, for example, because
>>> we'd never make it to v8086 mode.
>>>
>>> This code is still a huge, scary mess, but at least it's no longer
>>> tangled with the exit-to-userspace loop.
>>
>> This patch is incorrect. Brian, what's the ETA for your vm86 cleanup?
>> If it's very soon, then I'll see if I can rely on it. If not, I'll
>> have to come up with a way to fix this patch.
>>
>> Grr. The kernel state when handle_vm86_trap is called is absurd right
>> now. Somehow we're supposed to survive do_trap, send a signal
>> corresponding to the outside-vm86 state, and exit vm86 cleanly (with
>> ax = 0), all before returning to user mode. I doubt these semantics
>> are even intentional.
>>
>> This code sucks.
>
> OK, I have a version that seems to work. It comes with a much better
> selftest, too. I'll send it shortly.
>
> Brian, would it make sense to base your work on top of it?
>
> Now that I've looked at this stuff, if I were designing Linux support
> for v8086 mode, I'd do it very differently. There wouldn't be a vm86
> syscall at all. Instead you'd call sigaltstack, then raise a signal,
> set X86_EFLAGS_VM, and return.
>
> The kernel would handle X86_EFLAGS_VM being set by setting TIF_V8086
> and adjusting sp0. On entry, TIF_V8086 would move the segment
> registers from the hardware frame into pt_regs and, on exit, TIF_V8086
> would move them back. Clearing X86_EFLAGS_VM (via ptrace, signal
> delivery, or sigreturn) would sanitize the segment registers.
>
> SYSENTER would be safe, so the SYSENTER_CS hack wouldn't be needed.
> Of course, we'd lose the CPU state, so the user would have to be
> careful.
>
> And that's it. There wouldn't be any emulation -- user code could
> emulate syscalls all by itself in a signal handler. Exiting v8086
> mode would be straightforward -- just do anything that would raise a
> signal.
>
> Of course, this isn't at all ABI-compatible with the current turd, and
> v8086 mode isn't really that useful, so this is just idle retroactive
> speculation. But the TIF_V8086 trick would still be useful to let us
> get rid of all the awful hacks in the trap and exit code.

I'll post my patches tonight when I get home. It would probably make
more sense for you to base off mine, since it should eliminate the
need for you to touch any vm86 code.

I fixed the signal issue by checking if the VM flag is set in
handle_signal(), and swapping the register state there before pushing
the signal frame, but that is only possible after removing the need to
jump back into the exit asm routines. work_notifysig_v86 is gone too
as a result.

--
Brian Gerst
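
For illustration only, the handle_signal() check described above can be
modeled in user space roughly as follows. This is a toy sketch under
stated assumptions, not the actual patch: struct toy_regs, struct
toy_task and toy_leave_vm86_for_signal() are hypothetical names, the
register frame is reduced to four fields, and clearing ax follows the
"ax = 0" remark in the quoted text. Only X86_EFLAGS_VM (EFLAGS bit 17)
is a real constant.

```c
#include <stdbool.h>
#include <stdint.h>

/* Real x86 EFLAGS bit 17: set while the CPU runs in virtual-8086 mode. */
#define X86_EFLAGS_VM 0x00020000u

/* Hypothetical, heavily simplified user register frame. */
struct toy_regs {
	uint32_t ip;
	uint32_t sp;
	uint32_t flags;
	uint32_t ax;
};

/* Hypothetical per-task state for this sketch. */
struct toy_task {
	struct toy_regs regs;         /* frame the signal code will use   */
	struct toy_regs saved_regs32; /* 32-bit state saved at vm86 entry */
	struct toy_regs vm86_regs;    /* v8086 state stashed on exit      */
};

/*
 * If the current frame is in v8086 mode, stash it and swap the saved
 * 32-bit frame back in, so that a signal frame would be built from the
 * 32-bit state. Returns true if a swap happened.
 */
static bool toy_leave_vm86_for_signal(struct toy_task *t)
{
	if (!(t->regs.flags & X86_EFLAGS_VM))
		return false;

	t->vm86_regs = t->regs;    /* keep the v8086 state for the caller */
	t->regs = t->saved_regs32; /* resume from the 32-bit state        */
	t->regs.ax = 0;            /* assumption: exit with ax = 0        */
	return true;
}
```

Calling toy_leave_vm86_for_signal() on a frame whose flags include
X86_EFLAGS_VM restores the saved 32-bit frame and returns true. A second
call returns false, provided the saved 32-bit flags do not have the VM
bit set.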