From: Peter Zijlstra <peterz@infradead.org>
To: Eric Biggers <ebiggers3@gmail.com>
Cc: kernel-hardening@lists.openwall.com, keescook@chromium.org,
	arnd@arndb.de, tglx@linutronix.de, mingo@redhat.com,
	h.peter.anvin@intel.com, will.deacon@arm.com, dwindsor@gmail.com,
	gregkh@linuxfoundation.org, ishkamiel@gmail.com,
	Elena Reshetova <elena.reshetova@intel.com>
Subject: Re: [kernel-hardening] [RFC PATCH 06/19] Provide refcount_t, an atomic_t like primitive built just for refcounting.
Date: Tue, 3 Jan 2017 14:21:36 +0100
Message-ID: <20170103132136.GV3107@twins.programming.kicks-ass.net>
In-Reply-To: <20161230010627.GA9882@zzz>

On Thu, Dec 29, 2016 at 07:06:27PM -0600, Eric Biggers wrote:
> 
> ... and refcount_inc() compiles to over 100 bytes of instructions on x86_64.
> This is the wrong approach.  We need a low-overhead solution, otherwise no one
> will turn on refcount protection and the feature will be useless.

It's not something that can be turned on or off; refcount_t is
unconditional code. But you raise a good point on the size of the thing.

I count 116 bytes on x86_64. If I disable all the debug crud that
reduces to 45 bytes, and that's because GCC-6 is generating particularly
stupid code (albeit not as stupid as GCC-5 did).

A hand coded one ends up being 29 bytes, of which 6 are the function
pro- and epilogue, so effectively 23 bytes for inline.

Which we can reduce to 22 if instead of using a literal for UINT_MAX we
do: xor %[t], %[t]; dec %[t].

0000000000009ee0 <ponies>:
    9ee0:       55                      push   %rbp
    9ee1:       48 89 e5                mov    %rsp,%rbp
    9ee4:       8b 07                   mov    (%rdi),%eax
    9ee6:       85 c0                   test   %eax,%eax
    9ee8:       74 10                   je     9efa <ponies+0x1a>
    9eea:       89 c2                   mov    %eax,%edx
    9eec:       ff c2                   inc    %edx
    9eee:       73 04                   jae    9ef4 <ponies+0x14>
    9ef0:       31 d2                   xor    %edx,%edx
    9ef2:       ff ca                   dec    %edx
    9ef4:       f0 0f b1 17             lock cmpxchg %edx,(%rdi)
    9ef8:       75 ec                   jne    9ee6 <ponies+0x6>
    9efa:       5d                      pop    %rbp
    9efb:       c3                      retq

(caveat: I wrote this on a post-holidays brain without testing)
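
For readers who prefer C to a disassembly, the loop above can be sketched
with C11 atomics. This is only an illustration of the intended logic, not
the kernel implementation: refcount_sketch_t and refcount_inc_sketch() are
hypothetical names, and the sketch uses the default sequentially consistent
ordering rather than anything tuned. One detail to check before reusing the
listing as-is: x86 inc leaves CF untouched, so the jae after it sees the CF
that test just cleared; an add $1 would be needed for the carry test to do
what is meant.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>

/* Hypothetical stand-in for refcount_t; not the kernel's definition. */
typedef struct {
	atomic_uint refs;
} refcount_sketch_t;

/*
 * Increment the count unless it is 0, saturating at UINT_MAX instead of
 * wrapping. Mirrors the test/je, inc, saturate, cmpxchg loop above.
 */
static void refcount_inc_sketch(refcount_sketch_t *r)
{
	unsigned int old = atomic_load(&r->refs);
	unsigned int next;

	do {
		if (old == 0)
			return;		/* object is already dead, do not resurrect */

		next = old + 1;		/* unsigned arithmetic: wraps to 0 at UINT_MAX */
		if (next == 0)
			next = UINT_MAX;	/* saturate; the count stays pinned */

		/* On failure 'old' is refreshed with the current value, like %eax. */
	} while (!atomic_compare_exchange_weak(&r->refs, &old, next));
}
```

The key property is that no other CPU ever observes a wrapped value: the
only values ever stored are old + 1 or UINT_MAX.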

Also note that call overhead on an x86_64 (big core) is something like
1.5 cycles. And afaik Sparc64 is the architecture with the worst call
overhead, but that already has its atomic functions out of line for
different reasons.

> What exactly is wrong with the current solution in PAX/grsecurity?  Looking at
> the x86 version they have atomic_inc() do 'lock incl' like usual, then use 'jo'
> to, if the counter overflowed, jump to *out-of-line* error handling code, in a
> separate section of the kernel image.   Then it raises a software interrupt, and
> the interrupt handler sets the overflowed counter to INT_MAX and does the needed
> logging and signal raising.

Doing an unconditional INC on INT_MAX gives a temporarily visible
artifact of INT_MAX+1 (or INT_MIN) in the best case.

This is fundamentally not an atomic operation and therefore does not
belong in the atomic_* family, full stop.
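
A minimal C11 sketch of that artifact, assuming a plain atomic_int in place
of atomic_t; the function names are hypothetical. The increment and the
fix-up are two separate steps, and anything reading the counter between
them sees the wrapped value:

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>

/*
 * Step 1 of an inc-then-fix-up scheme: the unconditional increment.
 * Returns what any other CPU could read before the fix-up runs.
 * C11 atomic arithmetic on signed types wraps silently, like 'lock incl'.
 */
static int observe_after_unchecked_inc(atomic_int *v)
{
	atomic_fetch_add(v, 1);
	return atomic_load(v);
}

/* Step 2, run later by the overflow handler: pin the counter again. */
static void fixup_to_max(atomic_int *v)
{
	atomic_store(v, INT_MAX);
}
```

Starting from INT_MAX, the first function returns INT_MIN; only after
fixup_to_max() does the counter read INT_MAX again.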

Not to mention that the whole wrap/unwrap or checked/unchecked split of
atomic_t is a massive trainwreck. Moving over to refcount_t, which has
simple and well-defined semantics, forces us to audit and clean up all the
reference counting crud as it gets converted; this is a good thing.

Yes it takes more time and effort, but the end result is better code.

I understand why PaX/grsecurity chose not to do this, but that doesn't
make it a proper solution for upstream.


Now as to why refcount cannot be implemented using that scheme you
outlined:

	vCPU0			vCPU1

	lock inc %[r]
	jo

	<vcpu preempt-out>

				for lots
					refcount_dec_and_test(&obj->ref)
						/* hooray, we hit 0 */
						kfree(obj);

	<vcpu preempt-in>

	mov $0xFFFFFFFF, %[r] /* OOPS use-after-free */


Is this unlikely? Yes, extremely so. Do I want to be the one debugging
this? Heck no.
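
The interleaving can be replayed deterministically in a single thread. The
sketch below is a simulation under stated assumptions, not kernel code:
struct obj_sketch, put_many() and replay_race() are hypothetical, a 'freed'
flag stands in for kfree(), and one bulk subtraction stands in for the "for
lots" loop of refcount_dec_and_test() calls.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

struct obj_sketch {
	atomic_uint ref;
	bool freed;		/* set where the kernel would call kfree() */
};

/* Drop n references at once; true if the count reached 0. */
static bool put_many(struct obj_sketch *o, unsigned int n)
{
	return atomic_fetch_sub(&o->ref, n) == n;
}

/* Returns true if the fix-up store lands after the object was freed. */
static bool replay_race(void)
{
	struct obj_sketch o = { .freed = false };
	bool use_after_free;

	atomic_init(&o.ref, (unsigned int)INT_MAX);

	/* vCPU0: lock inc; signed overflow, so 'jo' is taken. */
	atomic_fetch_add(&o.ref, 1);
	assert(atomic_load(&o.ref) == (unsigned int)INT_MAX + 1u);

	/* vCPU0 is preempted; vCPU1 drops references until it hits 0. */
	if (put_many(&o, (unsigned int)INT_MAX + 1u))
		o.freed = true;

	/* vCPU0 resumes and stores the saturation value into 'o'. */
	use_after_free = o.freed;
	atomic_store(&o.ref, UINT_MAX);

	return use_after_free;
}
```

replay_race() returns true: by the time the fix-up store runs, the object
has already been released.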
