From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
To: Andy Lutomirski <luto@amacapital.net>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Russell King <linux@arm.linux.org.uk>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	linux-api <linux-api@vger.kernel.org>,
	Paul Turner <pjt@google.com>, Andrew Hunter <ahh@google.com>,
	Andi Kleen <andi@firstfloor.org>,
	Dave Watson <davejwatson@fb.com>, Chris Lameter <cl@linux.com>,
	Ben Maurer <bmaurer@fb.com>, rostedt <rostedt@goodmis.org>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Josh Triplett <josh@joshtriplett.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will.deacon@arm.com>,
	Michael Kerrisk <mtk.manpages@gmail.com>,
	Boqun Feng <boqun.feng@gmail.com>
Subject: Re: [RFC PATCH v7 1/7] Restartable sequences system call
Date: Wed, 10 Aug 2016 21:01:22 +0000 (UTC)
Message-ID: <1327322278.7807.1470862882633.JavaMail.zimbra@efficios.com>
In-Reply-To: <CALCETrUrMG0zNxEcYR8cPjG0A9p7TCrLLQmLGqmRzuqVy=pXUQ@mail.gmail.com>

----- On Aug 10, 2016, at 4:09 PM, Andy Lutomirski luto@amacapital.net wrote:

> On Wed, Aug 10, 2016 at 1:06 PM, Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote:

<snip>

>>> u64 is a perfectly valid, if odd, userspace pointer on all
>>> architectures that I know of, and it's certainly a valid userspace
>>> pointer on x86 32-bit userspace (the high bits will just all be zero).
>>> Can you just use u64?
>>
>> My concern is about a 32-bit user-space putting garbage rather than zeroes
>> (on purpose) to fool the kernel on those upper 32 bits. Doing
>>
>>   compat_ptr((compat_uptr_t)rseq_cs.start_ip)
>>
>> effectively ends up clearing the upper 32 bits.
>>
>> But since we only use those pointer values for comparisons, perhaps we
>> just don't care if a 32-bit userspace app tries to shoot itself in
>> the foot by passing garbage in the upper 32 bits?
>>
> 
> How is garbage in the high bits any different than garbage in any
> other bits in there?

It's not :)
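
To make the point above concrete, here is a minimal userspace sketch of
the two options. It is not the patch code: model_compat_truncate() only
imitates what compat_ptr((compat_uptr_t)v) does to the value, and
ip_in_range() and end_ip are hypothetical names used for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of compat_ptr((compat_uptr_t)v): the cast to a
 * 32-bit type discards the upper 32 bits of the 64-bit field.
 */
static inline uint64_t model_compat_truncate(uint64_t v)
{
	return (uint64_t)(uint32_t)v;
}

/*
 * Honor all 64 bits: the values are only compared against the
 * instruction pointer, never dereferenced. end_ip is an assumed name
 * for the end of the critical section.
 */
static inline int ip_in_range(uint64_t ip, uint64_t start_ip, uint64_t end_ip)
{
	return ip >= start_ip && ip < end_ip;
}
```

With all bits honored, a 32-bit task that stores garbage in the upper
half simply never matches its own instruction pointer, which is the same
outcome as storing garbage in the lower half.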

> 
>>
>>> If this would be a performance problem on ARM, then maybe that's a
>>> reason to use compat helpers.
>>
>> We already use 64-bit values for the pointers, even on 32-bit. Normally
>> userspace just puts zeroes in the top bits. It's mostly a question of
>> clearing the top 32 bits or not when loading them in the kernel. If we
>> don't need to, then I can remove the compat code entirely, and we don't
>> care about user_64bit_mode() anymore, as you initially recommended.
>> Does it make sense ?
> 
> Yes, I think so.  I'd suggest just honoring all the bits.

OK, will do !

> 
>>
>>>
>>>>
>>>>>
>>>>>
>>>>>>>> +SYSCALL_DEFINE2(rseq, struct rseq __user *, rseq, int, flags)
>>>>>>>> +{
>>>>>>>> +    if (unlikely(flags))
>>>>>>>> +            return -EINVAL;
>>>>>>>
>>>>>>> (add whitespace)
>>>>>>
>>>>>> fixed.
>>>>>>
>>>>>>>
>>>>>>>> +    if (!rseq) {
>>>>>>>> +            if (!current->rseq)
>>>>>>>> +                    return -ENOENT;
>>>>>>>> +            return 0;
>>>>>>>> +    }
>>>>>
>>>>> This looks entirely wrong.  Setting rseq to NULL fails if it's already
>>>>> NULL but silently does nothing if rseq is already set?  Surely it
>>>>> should always succeed and it should actually do something if rseq is
>>>>> set.
>>>>
>>>> From the proposed rseq(2) manpage:
>>>>
>>>> "A NULL rseq value can be used to check whether rseq is registered
>>>> for the current thread."
>>>>
>>>> The implementation does just that: it returns -1, errno=ENOENT if no
>>>> rseq is currently registered, or 0 if rseq is currently registered.
>>>
>>> I think that's problematic.  Why can't you unregister an existing
>>> rseq?  If you can't, how is a thread supposed to clean up after
>>> itself?
>>>
>>
>> Unregistering an existing thread rseq would require that we keep reference
>> counting, in case multiple libs and/or the app are using rseq. I am
>> trying to keep things as simple as needed.
>>
>> If I understand your concern, the problematic scenario would be at
>> thread exit (this is my current approximate understanding of glibc
>> handling of library TLS variable reclaim at thread exit):
>>
>> thread exits in userspace:
>> - glibc frees its rseq TLS memory area (in case the TLS is in a library),
>> - thread preempted before really exiting,
>> - kernel reads/writes to freed TLS memory.
>>   - corruption may occur (e.g. memory re-allocated by another thread already)
>>
>> Am I getting it right ?
> 
> Yes.

Hrm, then we should:

- add a rseq_refcount field to the task struct,
- increment this refcount whenever rseq receives a registration, after
  ensuring that we are registering the same address as was previously
  requested by preceding registrations for the thread (except if the
  refcount was 0),
- when rseq receives a NULL address, decrement the refcount, and set
  the registered address to NULL when the refcount reaches 0.

Doing the refcounting in kernel-space rather than user-space allows us to
keep both registration/unregistration and refcount atomic, which simplifies
things if we plan to use rseq from signal handlers.
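
A rough userspace model of the refcounting rules listed above, as a
sketch only. struct task_model, struct rseq_area_model and model_rseq()
are hypothetical stand-ins for the task struct, the user rseq area and
the system call. The error codes for an address mismatch, for
unregistering when nothing is registered, and for counter overflow are
assumptions, not something settled in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for the user-space rseq area. */
struct rseq_area_model { int placeholder; };

/* Hypothetical stand-in for the per-thread fields in the task struct. */
struct task_model {
	struct rseq_area_model *rseq;
	unsigned int rseq_refcount;
};

static int model_rseq(struct task_model *t, struct rseq_area_model *rseq,
		      int flags)
{
	if (flags)
		return -EINVAL;

	if (!rseq) {
		/* NULL address: drop one reference. */
		if (!t->rseq_refcount)
			return -ENOENT;		/* assumed error code */
		if (--t->rseq_refcount == 0)
			t->rseq = NULL;		/* last user is gone */
		return 0;
	}

	/* Later registrations must use the address already registered. */
	if (t->rseq_refcount && t->rseq != rseq)
		return -EINVAL;			/* assumed error code */
	if (t->rseq_refcount == UINT_MAX)
		return -EOVERFLOW;		/* assumed error code */

	t->rseq = rseq;
	t->rseq_refcount++;
	return 0;
}
```

In this model, two registrations of the same address followed by two
NULL calls leave the thread with no registered area, and a registration
with a different address while the refcount is non-zero is refused.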

With current glibc, a library that would lazily register and use rseq
without knowledge of the application would then have to use pthread_key_create()
to set a destr_function to run at thread exit, which would take care of
unregistration.
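
As an illustration of the pthread_key_create() approach, here is a
minimal sketch. The lib_rseq_* names are hypothetical, and
lib_rseq_unregister() only counts calls where a real library would
issue the unregistering rseq call. The pthread calls themselves are
the standard POSIX ones.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static pthread_key_t rseq_key;
static pthread_once_t rseq_key_once = PTHREAD_ONCE_INIT;
static atomic_int unregister_calls;	/* demonstration counter only */

/* Hypothetical placeholder for the unregistering rseq call. */
static void lib_rseq_unregister(void)
{
	atomic_fetch_add(&unregister_calls, 1);
}

/* Destructor: runs at thread exit for threads with a non-NULL value. */
static void lib_rseq_thread_exit(void *arg)
{
	(void)arg;
	lib_rseq_unregister();
}

static void lib_rseq_key_init(void)
{
	pthread_key_create(&rseq_key, lib_rseq_thread_exit);
}

/* Lazy per-thread registration, called by the library on first use. */
static int lib_rseq_lazy_register(void)
{
	pthread_once(&rseq_key_once, lib_rseq_key_init);
	if (pthread_getspecific(rseq_key))
		return 0;	/* this thread is already registered */
	/* A real library would register its rseq area here. */
	return pthread_setspecific(rseq_key, (void *)1);
}

/* Example thread body: registering twice still unregisters once. */
static void *worker(void *arg)
{
	(void)arg;
	lib_rseq_lazy_register();
	lib_rseq_lazy_register();
	return NULL;
}
```

Storing a non-NULL value with pthread_setspecific() is what arms the
destructor for that thread; threads that never touch the library never
run it.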

We could add a RSEQ_FORCE_UNREGISTER flag to the rseq flags to allow future
glibc versions to force unregistering rseq before freeing its TLS memory,
just in case a userspace library fails to unregister itself.

Thoughts ?

Thanks,

Mathieu


-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
