From: Kees Cook
Date: Tue, 21 Jun 2016 12:01:32 -0700
Subject: Re: the usage of __SYSCALL_MASK in entry_SYSCALL_64/do_syscall_64 is not consistent
To: Oleg Nesterov
Cc: Andy Lutomirski, Andy Lutomirski, X86 ML, "linux-kernel@vger.kernel.org", Borislav Petkov
In-Reply-To: <20160620175311.GA24505@redhat.com>
References: <94bda8cd5f326ae5591c80fb5d7c1c22624accec.1466244711.git.luto@kernel.org> <20160619211906.GA14712@redhat.com> <20160620175311.GA24505@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Jun 20, 2016 at 10:53 AM, Oleg Nesterov wrote:
> On 06/19, Andy Lutomirski wrote:
>>
>> Something's clearly buggy there,
>
> The usage of __X32_SYSCALL_BIT doesn't look right too. Nothing serious
> but still.
>
> Damn, initially I thought I have found the serious bug in entry_64.S
> and it took me some time to understand why my exploit doesn't work ;)
> So I learned that
>
>         andl $__SYSCALL_MASK, %eax
>
> in entry_SYSCALL_64_fastpath() zero-extends %rax and thus
>
>         cmpl $__NR_syscall_max, %eax
>         ...
>         call *sys_call_table(, %rax, 8)
>
> is correct (rax <= __NR_syscall_max).
>
> OK, so entry_64.S simply "ignores" the upper bits if CONFIG_X86_X32_ABI.
> Fine, but this doesn't match the
>
>         if (likely((nr & __SYSCALL_MASK) < NR_syscalls))
>
> check in do_syscall_64().
> So this test-case
>
>         #include <stdio.h>
>
>         int main(void)
>         {
>                 // __NR_exit == 0x3c
>                 asm volatile ("movq $0xFFFFFFFF0000003c, %rax; syscall");
>
>                 printf("I didn't exit because I am traced\n");
>
>                 return 0;
>         }
>
> silently exits if not traced, otherwise it calls printf().
>
> Should we do something or we do not care?

The slow path has seccomp, so there's no filter bypass with this. I
think it should get corrected, just for proper behavior, but it
currently looks harmless. It does, technically, double the attack
surface for userspace ROPish attacks since now the top half of the
register can be F instead of 0, but that's probably not a very big
deal.

-Kees

-- 
Kees Cook
Chrome OS & Brillo Security
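
To make the mismatch discussed above concrete, here is a minimal user-space sketch in C that models the two range checks. It is not kernel code. The function names and NR_SYSCALLS_MODEL are hypothetical, and the table size is an arbitrary illustrative value. The mask is assumed to be ~__X32_SYSCALL_BIT with the bit at 0x40000000, which is how I understand the CONFIG_X86_X32_ABI headers of that era.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to mirror __X32_SYSCALL_BIT and __SYSCALL_MASK with x32 enabled. */
#define X32_SYSCALL_BIT   0x40000000
#define SYSCALL_MASK      (~(X32_SYSCALL_BIT))  /* an int: sign-extends when widened */

/* Illustrative table size only; not taken from any particular kernel. */
#define NR_SYSCALLS_MODEL 547UL

/*
 * Model of the fast path: "andl $__SYSCALL_MASK, %eax" writes a 32-bit
 * register, so the upper 32 bits of %rax are cleared before the compare.
 */
static int fastpath_accepts(uint64_t rax)
{
    uint32_t eax = (uint32_t)rax & (uint32_t)SYSCALL_MASK;
    return eax <= NR_SYSCALLS_MODEL - 1;
}

/*
 * Model of the slow path: "(nr & __SYSCALL_MASK) < NR_syscalls" is done
 * on the full 64-bit value, so set upper bits survive the mask.
 */
static int slowpath_accepts(uint64_t nr)
{
    return (nr & SYSCALL_MASK) < NR_SYSCALLS_MODEL;
}

/* The value from Oleg's test case: __NR_exit with the top half set to F. */
static void model_selfcheck(void)
{
    uint64_t nr = 0xFFFFFFFF0000003cULL;

    assert(fastpath_accepts(nr));   /* untraced: the syscall runs, process exits */
    assert(!slowpath_accepts(nr));  /* traced: rejected, so printf() is reached */
}
```

Calling model_selfcheck() passes under these assumptions, which matches the reported behavior: the same register value is accepted on one path and rejected on the other.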