linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Ingo Molnar <mingo@kernel.org>
To: Thomas Garnier <thgarnie@google.com>
Cc: "Martin Schwidefsky" <schwidefsky@de.ibm.com>,
	"Heiko Carstens" <heiko.carstens@de.ibm.com>,
	"Arnd Bergmann" <arnd@arndb.de>,
	"Dave Hansen" <dave.hansen@intel.com>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"David Howells" <dhowells@redhat.com>,
	"René Nyffenegger" <mail@renenyffenegger.ch>,
	"Paul E . McKenney" <paulmck@linux.vnet.ibm.com>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Oleg Nesterov" <oleg@redhat.com>,
	"Stephen Smalley" <sds@tycho.nsa.gov>,
	"Pavel Tikhomirov" <ptikhomirov@virtuozzo.com>,
	"Ingo Molnar" <mingo@redhat.com>,
	"H . Peter Anvin" <hpa@zytor.com>,
	"Andy Lutomirski" <luto@kernel.org>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Kees Cook" <keescook@chromium.org>,
	"Rik van Riel" <riel@redhat.com>,
	"Josh Poimboeuf" <jpoimboe@redhat.com>,
	"Borislav Petkov" <bp@alien8.de>,
	"Brian Gerst" <brgerst@gmail.com>,
	"Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>,
	"Christian Borntraeger" <borntraeger@de.ibm.com>,
	"Russell King" <linux@armlinux.org.uk>,
	"Will Deacon" <will.deacon@arm.com>,
	"Catalin Marinas" <catalin.marinas@arm.com>,
	"Mark Rutland" <mark.rutland@arm.com>,
	"James Morse" <james.morse@arm.com>,
	linux-s390@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-api@vger.kernel.org, x86@kernel.org,
	linux-arm-kernel@lists.infradead.org,
	kernel-hardening@lists.openwall.com
Subject: Re: [PATCH v7 1/4] syscalls: Restore address limit after a syscall
Date: Tue, 25 Apr 2017 08:33:05 +0200	[thread overview]
Message-ID: <20170425063305.hwjuxupa37rwe6zj@gmail.com> (raw)
In-Reply-To: <20170410164420.64003-1-thgarnie@google.com>


* Thomas Garnier <thgarnie@google.com> wrote:

> This patch ensures a syscall does not return to user-mode with a kernel
> address limit. If that happened, a process can corrupt kernel-mode
> memory and elevate privileges.

Don't start changelogs with 'This patch' - it's obvious that we are talking about 
this patch. Writing:

   Ensure that a syscall does not return to user-mode with a kernel address limit. 
   If that happens, a process can corrupt kernel-mode memory and elevate 
   privileges.

also note the spelling fix I did. (There's another spelling error elsewhere in 
this changelog as well.)

Please read changelogs!

> For example, it would mitigation this bug:
> 
> - https://bugs.chromium.org/p/project-zero/issues/detail?id=990
> 
> The CONFIG_ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE option is also
> added so each architecture can optimize this change.

As I pointed it out in my previous reply this Kconfig name is awfully long - but 
it should have been obvious when this changelog was written ...

> Signed-off-by: Thomas Garnier <thgarnie@google.com>
> Tested-by: Kees Cook <keescook@chromium.org>
> ---
> Based on next-20170410
> ---
>  arch/s390/Kconfig        |  1 +
>  include/linux/syscalls.h | 26 +++++++++++++++++++++++++-
>  init/Kconfig             |  6 ++++++
>  kernel/sys.c             | 13 +++++++++++++
>  4 files changed, 45 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
> index d25435d94b6e..489a0cc6e46b 100644
> --- a/arch/s390/Kconfig
> +++ b/arch/s390/Kconfig
> @@ -103,6 +103,7 @@ config S390
>  	select ARCH_INLINE_WRITE_UNLOCK_BH
>  	select ARCH_INLINE_WRITE_UNLOCK_IRQ
>  	select ARCH_INLINE_WRITE_UNLOCK_IRQRESTORE
> +	select ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE
>  	select ARCH_SAVE_PAGE_KEYS if HIBERNATION
>  	select ARCH_SUPPORTS_ATOMIC_RMW
>  	select ARCH_SUPPORTS_DEFERRED_STRUCT_PAGE_INIT
> diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
> index 980c3c9b06f8..801a7a74fe28 100644
> --- a/include/linux/syscalls.h
> +++ b/include/linux/syscalls.h
> @@ -191,6 +191,27 @@ extern struct trace_event_functions exit_syscall_print_funcs;
>  	SYSCALL_METADATA(sname, x, __VA_ARGS__)			\
>  	__SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
>  
> +
> +/*
> + * Called before coming back to user-mode. Returning to user-mode with an
> + * address limit different than USER_DS can allow to overwrite kernel memory.
> + */
> +static inline void verify_pre_usermode_state(void) {
> +	BUG_ON(!segment_eq(get_fs(), USER_DS));
> +}

Non-standard coding style.

> +
> +#ifndef CONFIG_ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE
> +#define __CHECK_USER_CALLER() \
> +	bool user_caller = segment_eq(get_fs(), USER_DS)
> +#define __VERIFY_PRE_USERMODE_STATE() \
> +	if (user_caller) verify_pre_usermode_state()
> +#else
> +#define __CHECK_USER_CALLER()
> +#define __VERIFY_PRE_USERMODE_STATE()
> +asmlinkage void address_limit_check_failed(void);
> +#endif
> +
> +
>  #define __PROTECT(...) asmlinkage_protect(__VA_ARGS__)
>  #define __SYSCALL_DEFINEx(x, name, ...)					\
>  	asmlinkage long sys##name(__MAP(x,__SC_DECL,__VA_ARGS__))	\
> @@ -199,7 +220,10 @@ extern struct trace_event_functions exit_syscall_print_funcs;
>  	asmlinkage long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__));	\
>  	asmlinkage long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__))	\
>  	{								\
> -		long ret = SYSC##name(__MAP(x,__SC_CAST,__VA_ARGS__));	\
> +		long ret;						\
> +		__CHECK_USER_CALLER();					\
> +		ret = SYSC##name(__MAP(x,__SC_CAST,__VA_ARGS__));	\
> +		__VERIFY_PRE_USERMODE_STATE();				\
>  		__MAP(x,__SC_TEST,__VA_ARGS__);				\
>  		__PROTECT(x, ret,__MAP(x,__SC_ARGS,__VA_ARGS__));	\
>  		return ret;						\

BTW., the '__VERIFY_PRE_USERMODE_STATE()' name is highly misleading: the 'pre' 
prefix suggests that this is done before a system call - while it's done 
afterwards.

The solution is to not try to specify the exact call placement in the name, just 
describe the functionality (and harmonize along the common prefix).

> +config ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE
> +	bool
> +	help
> +	  Disable the generic pre-usermode state verification. Allow each
> +	  architecture to optimize how and when the verification is done.
> +

Please name the Kconfig symbols something like this:

	CONFIG_ADDR_LIMIT_CHECK
	CONFIG_ADDR_LIMIT_CHECK_ARCH

or so, which tells us whether the check is done by the architecture code, without 
breaking the col80 limit with a single Kconfig name.

BTW:

> +#ifdef CONFIG_ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE
> +/*
> + * This function is called when an architecture specific implementation detected
> + * an invalid address limit. The generic user-mode state checker will finish on
> + * the appropriate BUG_ON.
> + */
> +asmlinkage void address_limit_check_failed(void)
> +{
> +	verify_pre_usermode_state();
> +	panic("address_limit_check_failed called with a valid user-mode state");

It's very unconstructive to unconditionally panic the system, just because some 
kernel code leaked the address limit! Do a warn-once printout and kill the current 
task (i.e. don't continue execution), but don't crash everything else!

Thanks,

	Ingo

  parent reply	other threads:[~2017-04-25  6:33 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-04-10 16:44 [PATCH v7 1/4] syscalls: Restore address limit after a syscall Thomas Garnier
2017-04-10 16:44 ` [PATCH v7 2/4] x86/syscalls: Architecture specific pre-usermode check Thomas Garnier
2017-04-10 16:44 ` [PATCH v7 3/4] arm/syscalls: " Thomas Garnier
2017-04-10 16:44 ` [PATCH v7 4/4] arm64/syscalls: " Thomas Garnier
2017-04-10 17:12   ` Catalin Marinas
2017-04-10 20:06     ` Thomas Garnier
2017-04-10 20:09       ` Thomas Garnier
2017-04-10 20:07     ` Thomas Garnier
2017-04-24 23:57 ` [PATCH v7 1/4] syscalls: Restore address limit after a syscall Kees Cook
2017-04-25  6:23   ` Ingo Molnar
2017-04-25 14:12     ` Thomas Garnier
2017-04-25  6:33 ` Ingo Molnar [this message]
2017-04-25 14:18   ` Thomas Garnier
2017-04-26  8:12     ` Ingo Molnar
2017-04-26 14:09       ` Thomas Garnier

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170425063305.hwjuxupa37rwe6zj@gmail.com \
    --to=mingo@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=arnd@arndb.de \
    --cc=borntraeger@de.ibm.com \
    --cc=bp@alien8.de \
    --cc=brgerst@gmail.com \
    --cc=catalin.marinas@arm.com \
    --cc=dave.hansen@intel.com \
    --cc=dhowells@redhat.com \
    --cc=heiko.carstens@de.ibm.com \
    --cc=hpa@zytor.com \
    --cc=james.morse@arm.com \
    --cc=jpoimboe@redhat.com \
    --cc=keescook@chromium.org \
    --cc=kernel-hardening@lists.openwall.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=linux@armlinux.org.uk \
    --cc=luto@kernel.org \
    --cc=mail@renenyffenegger.ch \
    --cc=mark.rutland@arm.com \
    --cc=mingo@redhat.com \
    --cc=oleg@redhat.com \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=pbonzini@redhat.com \
    --cc=ptikhomirov@virtuozzo.com \
    --cc=riel@redhat.com \
    --cc=schwidefsky@de.ibm.com \
    --cc=sds@tycho.nsa.gov \
    --cc=tglx@linutronix.de \
    --cc=thgarnie@google.com \
    --cc=will.deacon@arm.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).