From: Mark Rutland <mark.rutland@arm.com>
To: Fredrik Markstrom <fredrik.markstrom@gmail.com>
Cc: linux-arm-kernel@lists.infradead.org,
	Russell King <linux@armlinux.org.uk>,
	Will Deacon <will.deacon@arm.com>,
	Chris Brandt <chris.brandt@renesas.com>,
	Nicolas Pitre <nico@linaro.org>,
	Ard Biesheuvel <ard.biesheuvel@linaro.org>,
	Arnd Bergmann <arnd@arndb.de>,
	Linus Walleij <linus.walleij@linaro.org>,
	Masahiro Yamada <yamada.masahiro@socionext.com>,
	Kees Cook <keescook@chromium.org>,
	Jonathan Austin <jonathan.austin@arm.com>,
	Zhaoxiu Zeng <zhaoxiu.zeng@gmail.com>,
	Michal Marek <mmarek@suse.com>,
	linux-kernel@vger.kernel.org, kristina.martsenko@arm.com
Subject: Re: [PATCH v2] arm: Added support for getcpu() vDSO using TPIDRURW
Date: Tue, 4 Oct 2016 18:07:41 +0100
Message-ID: <20161004170741.GC29008@leverpostej>
In-Reply-To: <1475595363-4272-1-git-send-email-fredrik.markstrom@gmail.com>

On Tue, Oct 04, 2016 at 05:35:33PM +0200, Fredrik Markstrom wrote:
> This makes getcpu() ~1000 times faster, this is very useful when
> implementing per-cpu buffers in userspace (to avoid cache line
> bouncing). As an example lttng ust becomes ~30% faster.
> 
> The patch will break applications using TPIDRURW (which is context switched
> since commit 4780adeefd042482f624f5e0d577bf9cdcbb760 ("ARM: 7735/2:

It looks like you dropped the leading 'a' from the commit ID. For
everyone else's benefit, the full ID is:

  a4780adeefd042482f624f5e0d577bf9cdcbb760

Please note that arm64 has done the same for compat tasks since commit:

  d00a3810c16207d2 ("arm64: context-switch user tls register tpidr_el0 for
  compat tasks")

> Preserve the user r/w register TPIDRURW on context switch and fork")) and
> is therefore made configurable.

As you note above, this is an ABI break and *will* break some existing
applications. That's generally a no-go.

This also leaves arm64's compat support with the existing behaviour, so
it would differ from arm.

I was under the impression that other mechanisms were being considered
for fast userspace access to per-cpu data structures, e.g. restartable
sequences. What is the state of those? Why is this better?

If getcpu() specifically is necessary, is there no other way to
implement it?
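
For context, the userspace pattern under discussion can be sketched
roughly as follows. This is only an illustration, assuming glibc's
sched_getcpu() as the way to reach getcpu(); MAX_CPUS, CACHE_LINE,
percpu_inc() and percpu_sum() are hypothetical names, not taken from
the patch or from lttng-ust.

```c
#define _GNU_SOURCE
#include <sched.h>   /* sched_getcpu(), glibc wrapper around getcpu() */
#include <stddef.h>

#define MAX_CPUS   1024  /* hypothetical upper bound for this sketch */
#define CACHE_LINE 64    /* assumed cache line size */

/* One slot per CPU, padded so that slots never share a cache line. */
struct percpu_counter {
	unsigned long count;
} __attribute__((aligned(CACHE_LINE)));

static struct percpu_counter counters[MAX_CPUS];

/* Increment the slot of the CPU we are (probably) running on.
 * Returns the CPU index used, or -1 if it could not be determined. */
static int percpu_inc(void)
{
	int cpu = sched_getcpu();

	if (cpu < 0 || cpu >= MAX_CPUS)
		return -1;

	/* The thread may migrate between sched_getcpu() and this update,
	 * so the increment still has to be atomic. */
	__atomic_fetch_add(&counters[cpu].count, 1, __ATOMIC_RELAXED);
	return cpu;
}

/* Sum all per-CPU slots to get the global count. */
static unsigned long percpu_sum(void)
{
	unsigned long sum = 0;

	for (size_t i = 0; i < MAX_CPUS; i++)
		sum += __atomic_load_n(&counters[i].count, __ATOMIC_RELAXED);
	return sum;
}
```

The cost of the sched_getcpu() call sits on the fast path of every
update, which is why its speed matters here. The remaining atomic
operation is the part that restartable sequences aim to remove.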

> +notrace int __vdso_getcpu(unsigned int *cpup, unsigned int *nodep,
> +			  struct getcpu_cache *tcache)
> +{
> +	unsigned long node_and_cpu;
> +
> +	asm("mrc p15, 0, %0, c13, c0, 2\n" : "=r"(node_and_cpu));
> +
> +	if (nodep)
> +		*nodep = cpu_to_node(node_and_cpu >> 16);
> +	if (cpup)
> +		*cpup  = node_and_cpu & 0xffffUL;

Given this is directly user-accessible, this format is a de-facto ABI,
even if it's not documented as such. Is this definitely the format you
want long-term?
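
To make the concern concrete, the layout implied by the hunk above can
be written down as below. This is a sketch read off the quoted code
only: the low 16 bits carry the CPU number and the upper 16 bits carry
the value the patch passes to cpu_to_node(). The macro and helper names
are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout, taken from the quoted patch:
 *   bits  0-15: CPU number
 *   bits 16-31: value the patch hands to cpu_to_node()
 */
#define GETCPU_CPU_MASK    0xffffUL
#define GETCPU_UPPER_SHIFT 16

/* Hypothetical helper: build the value the kernel would store in TPIDRURW. */
static inline uint32_t getcpu_pack(uint32_t cpu, uint32_t upper)
{
	assert(cpu <= GETCPU_CPU_MASK && upper <= GETCPU_CPU_MASK);
	return (upper << GETCPU_UPPER_SHIFT) | cpu;
}

/* Hypothetical helpers: what any userspace reader of the register would do. */
static inline uint32_t getcpu_unpack_cpu(uint32_t reg)
{
	return reg & GETCPU_CPU_MASK;
}

static inline uint32_t getcpu_unpack_upper(uint32_t reg)
{
	return reg >> GETCPU_UPPER_SHIFT;
}
```

Once userspace can read the register directly, nothing stops it from
open-coding the unpack helpers, so the field widths and positions
could not be changed later without breaking such code.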

Thanks,
Mark.
