From: Mark Rutland <mark.rutland@arm.com>
To: Paul Walmsley <paul.walmsley@sifive.com>
Cc: "Paul Walmsley" <paul@pwsan.com>,
	"Björn Töpel" <bjorn.topel@gmail.com>,
	"Palmer Dabbelt" <palmer@sifive.com>,
	will.deacon@arm.com, catalin.marinas@arm.com,
	"Nick Kossifidis" <mick@ics.forth.gr>,
	"Christopher Lameter" <cl@linux.com>,
	linux-riscv@lists.infradead.org,
	linux-arm-kernel@lists.infradead.org
Subject: Re: per-cpu thoughts
Date: Tue, 12 Mar 2019 11:23:50 +0000
Message-ID: <20190312112349.GA35803@lakrids.cambridge.arm.com>
In-Reply-To: <alpine.DEB.2.21.9999.1903111110520.11892@viisi.sifive.com>

On Mon, Mar 11, 2019 at 11:39:56AM -0700, Paul Walmsley wrote:
> On Mon, 11 Mar 2019, Mark Rutland wrote:
> > On Mon, Mar 11, 2019 at 08:26:45AM -0700, Paul Walmsley wrote:
> > > On Mon, 11 Mar 2019, Björn Töpel wrote:
> > > > 
> > > > But the generic one disables interrupts, right?
> > > > 
> > > > I believe the rationale for RV is similar to ARM's; AMO+preemption
> > > > disable regions is *slightly* better than the generic, but not as good
> > > > as the IA one. Or am I missing something?
> > > 
> > > There's been a discussion going on in a private thread about this that I 
> > > unfortunately didn't add you to.  The discussion is still ongoing, but I 
> > > think Christoph and myself and a few other folks have agreed that the 
> > > preempt_disable/enable is not needed for the amoadd approach.  This is 
> > > since the apparent intention of the preemption disable/enable is to ensure 
> > > the correctness of the counter increment; however there is no risk of 
> > > incorrectness in an amoadd sequence since the atomic add is locked across 
> > > all of the cache coherency domain. 
> > 
> > We also thought that initially, but there's a subtle race that can
> > occur, and so we added code to disable preemption in commit:
> > 
> >   f3eab7184ddcd486 ("arm64: percpu: Make this_cpu accessors pre-empt safe")
> > 
> > The problem on arm64 is that our atomics take a single base register,
> > and we have to generate the percpu address with separate instructions
> > from the atomic itself. That means we can get preempted between address
> > generation and the atomic, which is problematic for sequences like:
> > 
> > 	// Thread-A			// Thread-B
> > 
> > 	this_cpu_add(var)
> > 					local_irq_save(flags)
> > 					...
> > 					v = __this_cpu_read(var);
> > 					v = some_function(v);
> > 					__this_cpu_write(var, v);
> > 					...
> > 					local_irq_restore(flags)
> > 
> > ... which can unexpectedly race as:
> > 
> > 
> > 	// Thread-A			// Thread-B
> > 	
> > 	< generate CPU X addr >
> > 	< preempted >
> > 
> > 					< scheduled on CPU X >
> > 					local_irq_save(flags);
> > 					v = __this_cpu_read(var);
> > 
> > 	< scheduled on CPU Y >
> > 	< add to CPU X's var >
> > 					v = some_function(v);
> > 					__this_cpu_write(var, v);
> > 					local_irq_restore(flags);
> > 
> > ... and hence we lose an update to a percpu variable.
> 
> Makes sense, and thank you very much for the detailed sequences.  
> Open-coded per-cpu code sequences would also cause RISC-V to be exposed to 
> the same issue.  Christoph mentioned this, but at the time my attention 
> was focused on per-cpu counters, and not the broader range of per-cpu code 
> sequences.
> 
> > I suspect RISC-V would have the same problem, unless its AMOs can
> > generate the percpu address and perform the update in a single
> > instruction.
> 
> My understanding is that many of Christoph's per-cpu performance concerns 
> revolve around counters in the VM code, such as:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/mm/vmstat.c#n355

The mod_*_state() functions are the only ones which mess with
preemption, and that should only mandate a few locally-visible
modifications of preempt_count.

Similar cases apply within SLUB, and I'd hoped to improve that with my
this-cpu-reg branch, but I didn't see a measurable improvement on
workloads I tried.

Have you seen a measurable performance problem here?

> and probably elsewhere by now.  It may be worth creating a distinct API 
> for those counters.  If only increment, decrement, and read operations are 
> needed, there shouldn't be a need to disable or re-enable 
> preemption in those code paths - assuming that one is either able to 
> tolerate the occasional cache line bounce or retries in a long LL/SC 
> sequence.  Any opinions on that?

I'm afraid I don't understand this code well enough to say whether that
would be safe.

It's not clear to me whether there would be a measurable performance
difference, as I'd expect fiddling with preempt_count to be relatively
cheap. The AMOs themselves don't need to enforce ordering here, and only
a few compiler barriers are necessary.

Thanks,
Mark.

> > FWIW, I had a go at building percpu ops that didn't need to disable
> > preemption, but that required LL/SC atomics, reserving a GPR for the
> > percpu offset, and didn't result in a measurable difference in practice.
> > The patches are at:
> > 
> > https://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git/commit/?h=arm64/this-cpu-reg&id=84ee5f23f93d4a650e828f831da9ed29c54623c5
> 
> Very interesting indeed.  Thank you for sharing that,
> 
> - Paul
