linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Andy Lutomirski <luto@amacapital.net>
To: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Russell King <linux@arm.linux.org.uk>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	linux-api <linux-api@vger.kernel.org>,
	Paul Turner <pjt@google.com>, Andrew Hunter <ahh@google.com>,
	Andi Kleen <andi@firstfloor.org>,
	Dave Watson <davejwatson@fb.com>, Chris Lameter <cl@linux.com>,
	Ben Maurer <bmaurer@fb.com>, rostedt <rostedt@goodmis.org>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Josh Triplett <josh@joshtriplett.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will.deacon@arm.com>,
	Michael Kerrisk <mtk.manpages@gmail.com>,
	Boqun Feng <boqun.feng@gmail.com>
Subject: Re: [RFC PATCH v7 1/7] Restartable sequences system call
Date: Wed, 10 Aug 2016 01:10:31 -0700	[thread overview]
Message-ID: <CALCETrUTpq6qX5SS170y_VtwiPc-SeR03bt-1Q98n+YMZMqopA@mail.gmail.com> (raw)
In-Reply-To: <656745027.6624.1470773200334.JavaMail.zimbra@efficios.com>

On Tue, Aug 9, 2016 at 1:06 PM, Mathieu Desnoyers
<mathieu.desnoyers@efficios.com> wrote:
> ----- On Aug 3, 2016, at 9:19 AM, Peter Zijlstra peterz@infradead.org wrote:
>

>>
>>> +#endif
>>>  /* CPU-specific state of this task */
>>>      struct thread_struct thread;
>>>  /*
>>> @@ -3387,4 +3392,67 @@ void cpufreq_add_update_util_hook(int cpu, struct
>>> update_util_data *data,
>>>  void cpufreq_remove_update_util_hook(int cpu);
>>>  #endif /* CONFIG_CPU_FREQ */
>>>
>>> +#ifdef CONFIG_RSEQ
>>> +static inline void rseq_set_notify_resume(struct task_struct *t)
>>> +{
>>> +    if (t->rseq)
>>> +            set_tsk_thread_flag(t, TIF_NOTIFY_RESUME);
>>> +}
>>
>> Maybe I missed it, but why do we want to hook into NOTIFY_RESUME and not
>> have our own TIF flag?
>
> The short answer is that used the same approach as Paul Turner's patchset. ;)
>
> Through a deeper look into this, the only times we set the flag is when
> preempting and delivering a signal to a thread that has registered to
> rseq.
>
> Upon return to user-space with the flag set, the performance difference
> between having our own flag and hopping into the NOTIFY_RESUME bandwagon
> is that we can skip the various tests in exit_to_usermode_loop()
> with our own flag, at the expense of crowding the thread flags even
> nearer to filling up 32 bits, which will at some point require extra
> tests on the fast-path.

I don't think we're anywhere near running out.  Several of those flags
can probably go away pretty easily, too.

>
> Thinking about it, one benchmark I have not done so far is to modify
> hackbench so it registers its threads with the rseq system call. We
> can then figure out whether reserving a flag for rseq is justified or
> not.
>
> Comparing 10 runs of hackbench registering its sender/receiver threads
> with unmodified hackbench: (hackbench -l 100000)
>
>     Configuration: 2 sockets * 8-core Intel(R) Xeon(R) CPU E5-2630 v3 @
>     2.40GHz (directly on hardware, hyperthreading disabled in BIOS, energy
>     saving disabled in BIOS, turboboost disabled in BIOS, cpuidle.off=1
>     kernel parameter), with a Linux v4.7 defconfig+localyesconfig,
>     restartable sequences series applied.
>
>                                     Avg. Time (s)    Std.dev. (s)
> Unmodified Hackbench                    40.5            0.1
> Rseq-Registered Hackbench Threads       40.4            0.1
>
> So initial results seems to indicate that adding the notify_resume
> handling upon preemption does not have noticeable effects on
> performance, so I don't consider it worthwhile to try optimizing
> it by reserving its own thread flag. Or perhaps am I missing something
> important here ?
>

I don't think so.  One benefit of using do_notify_resume would be less
arch code.

> Actually, we want copy_from_user() there. This executes upon
> resume to user-space, so we can take a page fault is needed, so
> no "inatomic" needed. I therefore suggest:

Running the code below via exit_to_usermode_loop...

>
> static bool rseq_get_rseq_cs(struct task_struct *t,
>                 void __user **start_ip,
>                 void __user **post_commit_ip,
>                 void __user **abort_ip)
> {
>         unsigned long ptr;
>         struct rseq_cs __user *urseq_cs;
>         struct rseq_cs rseq_cs;
>
>         if (__get_user(ptr, &t->rseq->rseq_cs))
>                 return false;
>         if (!ptr)
>                 return true;
> #ifdef CONFIG_COMPAT
>         if (in_compat_syscall()) {
>                 urseq_cs = compat_ptr((compat_uptr_t)ptr);
>                 if (copy_from_user(&rseq_cs, urseq_cs, sizeof(*rseq_cs)))
>                         return false;
>                 *start_ip = compat_ptr((compat_uptr_t)rseq_cs.start_ip);
>                 *post_commit_ip = compat_ptr((compat_uptr_t)rseq_cs.post_commit_ip);
>                 *abort_ip = compat_ptr((compat_uptr_t)rseq_cs.abort_ip);
>                 return true;
>         }
> #endif

...means that in_compat_syscall() is nonsense.  (It *works* there, but
I can't imagine that it does anything that is actually sensible for
this use.)

Can't you just define the ABI so that no compat junk is needed?
(Also, CRIU will thank you for doing that.)


>>> +SYSCALL_DEFINE2(rseq, struct rseq __user *, rseq, int, flags)
>>> +{
>>> +    if (unlikely(flags))
>>> +            return -EINVAL;
>>
>> (add whitespace)
>
> fixed.
>
>>
>>> +    if (!rseq) {
>>> +            if (!current->rseq)
>>> +                    return -ENOENT;
>>> +            return 0;
>>> +    }

This looks entirely wrong.  Setting rseq to NULL fails if it's already
NULL but silently does nothing if rseq is already set?  Surely it
should always succeed and it should actually do something if rseq is
set.


-- 
Andy Lutomirski
AMA Capital Management, LLC

  parent reply	other threads:[~2016-08-10 19:16 UTC|newest]

Thread overview: 82+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-07-21 21:14 [RFC PATCH v7 0/7] Restartable sequences system call Mathieu Desnoyers
2016-07-21 21:14 ` [RFC PATCH v7 1/7] " Mathieu Desnoyers
2016-07-25 23:02   ` Andy Lutomirski
2016-07-26  3:02     ` Mathieu Desnoyers
2016-08-03 12:27       ` Peter Zijlstra
2016-08-03 16:37         ` Andy Lutomirski
2016-08-03 18:31           ` Christoph Lameter
2016-08-04  5:01             ` Andy Lutomirski
2016-08-04  4:27           ` Boqun Feng
2016-08-04  5:03             ` Andy Lutomirski
2016-08-09 16:13               ` Boqun Feng
2016-08-10  8:01                 ` Andy Lutomirski
2016-08-10 17:40                   ` Mathieu Desnoyers
2016-08-10 17:33                 ` Mathieu Desnoyers
2016-08-11  4:54                   ` Boqun Feng
2016-08-10  8:13               ` Andy Lutomirski
2016-08-03 18:29       ` Christoph Lameter
2016-08-10 16:47         ` Mathieu Desnoyers
2016-08-10 16:59           ` Christoph Lameter
2016-07-27 15:03   ` Boqun Feng
2016-07-27 15:05     ` [RFC 1/4] rseq/param_test: Convert test_data_entry::count to intptr_t Boqun Feng
2016-07-27 15:05       ` [RFC 2/4] Restartable sequences: powerpc architecture support Boqun Feng
2016-07-28  3:13         ` Mathieu Desnoyers
2016-07-27 15:05       ` [RFC 3/4] Restartable sequences: Wire up powerpc system call Boqun Feng
2016-07-28  3:13         ` Mathieu Desnoyers
2016-07-27 15:05       ` [RFC 4/4] Restartable sequences: Add self-tests for PPC Boqun Feng
2016-07-28  2:59         ` Mathieu Desnoyers
2016-07-28  4:43           ` Boqun Feng
2016-07-28  7:37             ` [RFC v2] " Boqun Feng
2016-07-28 14:04               ` Mathieu Desnoyers
2016-07-28 13:42             ` [RFC 4/4] " Mathieu Desnoyers
2016-07-28  3:07       ` [RFC 1/4] rseq/param_test: Convert test_data_entry::count to intptr_t Mathieu Desnoyers
2016-07-28  3:10     ` [RFC PATCH v7 1/7] Restartable sequences system call Mathieu Desnoyers
2016-08-03 13:19   ` Peter Zijlstra
2016-08-03 14:53     ` Paul E. McKenney
2016-08-03 15:45     ` Boqun Feng
2016-08-07 15:36       ` Mathieu Desnoyers
2016-08-07 23:35         ` Boqun Feng
2016-08-09 13:22           ` Mathieu Desnoyers
2016-08-09 20:06     ` Mathieu Desnoyers
2016-08-09 21:33       ` Peter Zijlstra
2016-08-09 22:41         ` Mathieu Desnoyers
2016-08-10  7:50           ` Peter Zijlstra
2016-08-10 13:26             ` Mathieu Desnoyers
2016-08-10 13:33               ` Peter Zijlstra
2016-08-10 14:04                 ` Mathieu Desnoyers
2016-08-10  8:10       ` Andy Lutomirski [this message]
2016-08-10 19:04         ` Mathieu Desnoyers
2016-08-10 19:16           ` Andy Lutomirski
2016-08-10 20:06             ` Mathieu Desnoyers
2016-08-10 20:09               ` Andy Lutomirski
2016-08-10 21:01                 ` Mathieu Desnoyers
2016-08-11  7:23                   ` Andy Lutomirski
2016-08-10  8:43       ` Peter Zijlstra
2016-08-10 13:57         ` Mathieu Desnoyers
2016-08-10 14:28           ` Peter Zijlstra
2016-08-10 14:44             ` Mathieu Desnoyers
2016-08-10 13:29       ` Peter Zijlstra
2016-07-21 21:14 ` [RFC PATCH v7 2/7] tracing: instrument restartable sequences Mathieu Desnoyers
2016-07-21 21:14 ` [RFC PATCH v7 3/7] Restartable sequences: ARM 32 architecture support Mathieu Desnoyers
2016-07-21 21:14 ` [RFC PATCH v7 4/7] Restartable sequences: wire up ARM 32 system call Mathieu Desnoyers
2016-07-21 21:14 ` [RFC PATCH v7 5/7] Restartable sequences: x86 32/64 architecture support Mathieu Desnoyers
2016-07-21 21:14 ` [RFC PATCH v7 6/7] Restartable sequences: wire up x86 32/64 system call Mathieu Desnoyers
2016-07-21 21:14 ` [RFC PATCH v7 7/7] Restartable sequences: self-tests Mathieu Desnoyers
     [not found]   ` <CO1PR15MB09822FC140F84DCEEF2004CDDD0B0@CO1PR15MB0982.namprd15.prod.outlook.com>
2016-07-24  3:09     ` Mathieu Desnoyers
2016-07-24 18:01       ` Dave Watson
2016-07-25 16:43         ` Mathieu Desnoyers
2016-08-11 23:26         ` Mathieu Desnoyers
2016-08-12  1:28           ` Boqun Feng
2016-08-12  3:10             ` Mathieu Desnoyers
2016-08-12  3:13               ` Mathieu Desnoyers
2016-08-12  5:30               ` Boqun Feng
2016-08-12 16:35                 ` Boqun Feng
2016-08-12 18:11                   ` Mathieu Desnoyers
2016-08-13  1:28                     ` Boqun Feng
2016-08-14 15:02                       ` Mathieu Desnoyers
2016-08-15  0:56                         ` Boqun Feng
2016-08-15 18:06                           ` Mathieu Desnoyers
2016-08-12 19:36           ` Mathieu Desnoyers
2016-08-12 20:05             ` Dave Watson
2016-08-14 17:09               ` Mathieu Desnoyers
2016-07-25 18:12     ` Mathieu Desnoyers

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CALCETrUTpq6qX5SS170y_VtwiPc-SeR03bt-1Q98n+YMZMqopA@mail.gmail.com \
    --to=luto@amacapital.net \
    --cc=ahh@google.com \
    --cc=akpm@linux-foundation.org \
    --cc=andi@firstfloor.org \
    --cc=bmaurer@fb.com \
    --cc=boqun.feng@gmail.com \
    --cc=catalin.marinas@arm.com \
    --cc=cl@linux.com \
    --cc=davejwatson@fb.com \
    --cc=hpa@zytor.com \
    --cc=josh@joshtriplett.org \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux@arm.linux.org.uk \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=mingo@redhat.com \
    --cc=mtk.manpages@gmail.com \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=peterz@infradead.org \
    --cc=pjt@google.com \
    --cc=rostedt@goodmis.org \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=will.deacon@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).