All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
To: Andy Lutomirski <luto@amacapital.net>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Peter Zijlstra <peterz@infradead.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Boqun Feng <boqun.feng@gmail.com>, Andrew Hunter <ahh@google.com>,
	maged michael <maged.michael@gmail.com>,
	gromer <gromer@google.com>, Avi Kivity <avi@scylladb.com>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Paul Mackerras <paulus@samba.org>,
	Michael Ellerman <mpe@ellerman.id.au>,
	Dave Watson <davejwatson@fb.com>,
	Andy Lutomirski <luto@kernel.org>,
	Will Deacon <will.deacon@arm.com>, Hans Boehm <hboehm@google.com>
Subject: Re: [PATCH v2] membarrier: provide register sync core cmd
Date: Mon, 28 Aug 2017 03:05:46 +0000 (UTC)	[thread overview]
Message-ID: <1463521395.16945.1503889546934.JavaMail.zimbra@efficios.com> (raw)
In-Reply-To: <A2504E20-BD1F-4B8D-9497-26A64659C2D8@amacapital.net>

----- On Aug 27, 2017, at 3:53 PM, Andy Lutomirski luto@amacapital.net wrote:

>> On Aug 27, 2017, at 1:50 PM, Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
>> wrote:
>> 
>> Add a new MEMBARRIER_CMD_REGISTER_SYNC_CORE command to the membarrier
>> system call. It allows processes to register their intent to have their
>> threads issue core serializing barriers in addition to memory barriers
>> whenever a membarrier command is performed.
>> 
> 
> Why is this stateful?  That is, why not just have a new membarrier command to
> sync every thread's icache?

If we'd do it on every CPU icache, it would be as trivial as you say. The
concern here is sending IPIs only to CPUs running threads that belong to the
same process, so we don't disturb unrelated processes.

If we could just grab each CPU's runqueue lock, it would be fairly simple
to do. But we want to avoid hitting each runqueue with exclusive atomic
access associated with grabbing the lock. (cache-line bouncing)

So, the "private" membarrier command end up reading the rq->curr->mm pointer
value for each runqueue, and compare them to its own current->mm value.
However, this means that whenever we skip a CPU, we're not sending an
IPI to that CPU. So we rely on the scheduler for providing the required
full barrier either before storing to rq->curr, after user-space memory
accesses performed by "prev", as well as after storing to rq->curr,
before user-space memory accesses performed by "next".

The IPI of the private membarrier can issue issue both smp_mb()
and sync_core() (that's what my implementation does).
However, having sys_membarrier issue core serializing barriers adds
extra constraints on entry into the scheduler/resuming to user-space.
It's not sufficient to order user-space memory accesses wrt storing
to rq->curr;  we also want to serialize the core execution. This
is why I'm adding sync_core before the full barrier on entry, and
sync_core after the full barrier on exit. Arguably, some architectures
may not need the extra sync_core on exit (e.g. x86 has iret which
implies core serialization), there are cases where it's not guaranteed
(AFAIK sysexit), and it's rarely guaranteed on entry.

So, we end up with the possibility of adding the core serialization
unconditionally on entry and exit of scheduler. However, as my numbers
below show, performance is slightly impacted in heavy benchmarks.
Therefore, I propose to make processes register their intent to
have the scheduler issue core serializing barriers on their behalf
when it schedules them out/in.

>> * Scheduler Overhead Benchmarks
>> 
>> Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz
>> taskset 01 ./perf bench sched pipe -T
>> Linux v4.13-rc6
>> 
>>                       Avg. usecs/op         Std.Dev. usecs/op
>> Before this change:         2.75                   0.12
>> Non-registered processes:   2.73                   0.08
>> Registered processes:       3.07                   0.02
>> 

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

  reply	other threads:[~2017-08-28  3:04 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-08-27 20:50 [PATCH v2] membarrier: provide register sync core cmd Mathieu Desnoyers
2017-08-27 22:53 ` Andy Lutomirski
2017-08-28  3:05   ` Mathieu Desnoyers [this message]
2017-08-30  5:01     ` Andy Lutomirski
2017-08-30 14:46       ` Paul E. McKenney
2017-08-30 17:33         ` Mathieu Desnoyers
2017-08-31 17:00     ` Will Deacon
2017-08-30  4:01 ` Boqun Feng
2017-08-30 17:25   ` Mathieu Desnoyers

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1463521395.16945.1503889546934.JavaMail.zimbra@efficios.com \
    --to=mathieu.desnoyers@efficios.com \
    --cc=ahh@google.com \
    --cc=avi@scylladb.com \
    --cc=benh@kernel.crashing.org \
    --cc=boqun.feng@gmail.com \
    --cc=davejwatson@fb.com \
    --cc=gromer@google.com \
    --cc=hboehm@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@amacapital.net \
    --cc=luto@kernel.org \
    --cc=maged.michael@gmail.com \
    --cc=mpe@ellerman.id.au \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=paulus@samba.org \
    --cc=peterz@infradead.org \
    --cc=will.deacon@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.