From: Mark Rutland <mark.rutland@arm.com>
To: Fredrik Markstrom <fredrik.markstrom@gmail.com>
Cc: linux-arm-kernel@lists.infradead.org,
	Russell King <linux@armlinux.org.uk>,
	Will Deacon <will.deacon@arm.com>,
	Chris Brandt <chris.brandt@renesas.com>,
	Nicolas Pitre <nico@linaro.org>,
	Ard Biesheuvel <ard.biesheuvel@linaro.org>,
	Arnd Bergmann <arnd@arndb.de>,
	Linus Walleij <linus.walleij@linaro.org>,
	Masahiro Yamada <yamada.masahiro@socionext.com>,
	Kees Cook <keescook@chromium.org>,
	Jonathan Austin <jonathan.austin@arm.com>,
	Zhaoxiu Zeng <zhaoxiu.zeng@gmail.com>,
	Michal Marek <mmarek@suse.com>,
	linux-kernel@vger.kernel.org,
	kristina.martsenko@arm.com
Subject: Re: [PATCH v2] arm: Added support for getcpu() vDSO using TPIDRURW
Date: Tue, 4 Oct 2016 18:07:41 +0100
Message-ID: <20161004170741.GC29008@leverpostej>
In-Reply-To: <1475595363-4272-1-git-send-email-fredrik.markstrom@gmail.com>

On Tue, Oct 04, 2016 at 05:35:33PM +0200, Fredrik Markstrom wrote:
> This makes getcpu() ~1000 times faster, this is very useful when
> implementing per-cpu buffers in userspace (to avoid cache line
> bouncing). As an example lttng ust becomes ~30% faster.
>
> The patch will break applications using TPIDRURW (which is context switched
> since commit 4780adeefd042482f624f5e0d577bf9cdcbb760 ("ARM: 7735/2:

It looks like you dropped the leading 'a' from the commit ID. For
everyone else's benefit, the full ID is:

  a4780adeefd042482f624f5e0d577bf9cdcbb760

Please note that arm64 has done similar for compat tasks since commit:

  d00a3810c16207d2 ("arm64: context-switch user tls register tpidr_el0 for
  compat tasks")

> Preserve the user r/w register TPIDRURW on context switch and fork")) and
> is therefore made configurable.

As you note above, this is an ABI break and *will* break some existing
applications. That's generally a no-go.

This also leaves arm64's compat with the existing behaviour, differing
from arm.

I was under the impression that other mechanisms were being considered
for fast userspace access to per-cpu data structures, e.g. restartable
sequences. What is the state of those? Why is this better?

If getcpu() specifically is necessary, is there no other way to
implement it?

> +notrace int __vdso_getcpu(unsigned int *cpup, unsigned int *nodep,
> +			  struct getcpu_cache *tcache)
> +{
> +	unsigned long node_and_cpu;
> +
> +	asm("mrc p15, 0, %0, c13, c0, 2\n" : "=r"(node_and_cpu));
> +
> +	if (nodep)
> +		*nodep = cpu_to_node(node_and_cpu >> 16);
> +	if (cpup)
> +		*cpup = node_and_cpu & 0xffffUL;

Given this is directly user-accessible, this format is a de-facto ABI,
even if it's not documented as such. Is this definitely the format you
want long-term?

Thanks,
Mark.
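For readers following the last point: the quoted hunk implies that the
register value is split into a CPU number in the low 16 bits and a second
field in the remaining upper bits (which the patch passes to
cpu_to_node()). A minimal sketch of that layout, assuming a 32-bit
register value, is below. The helper names and the use of uint32_t are
illustrative assumptions and do not come from the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helpers; the names below are illustrative and do not
 * appear in the patch. uint32_t models the 32-bit TPIDRURW value.
 */
#define CPU_FIELD_BITS 16u
#define CPU_FIELD_MASK 0xffffu

/* Pack an upper field and a CPU number the way the quoted code reads them. */
static uint32_t pack_node_and_cpu(uint32_t upper, uint32_t cpu)
{
	return (upper << CPU_FIELD_BITS) | (cpu & CPU_FIELD_MASK);
}

/* Mirrors "node_and_cpu & 0xffffUL" in the quoted hunk. */
static uint32_t unpack_cpu(uint32_t node_and_cpu)
{
	return node_and_cpu & CPU_FIELD_MASK;
}

/* Mirrors "node_and_cpu >> 16"; the patch then passes this to cpu_to_node(). */
static uint32_t unpack_upper(uint32_t node_and_cpu)
{
	return node_and_cpu >> CPU_FIELD_BITS;
}

/* Round-trip check; also shows the 16-bit limit the layout bakes in. */
static void check_layout(void)
{
	uint32_t v = pack_node_and_cpu(2, 5);

	assert(unpack_cpu(v) == 5);
	assert(unpack_upper(v) == 2);
	/* A CPU number above 0xffff cannot be represented and is truncated. */
	assert(unpack_cpu(pack_node_and_cpu(0, 0x10005)) == 5);
}
```

Because userspace can read the register directly, any program that decodes
the value this way would depend on the 16/16 split, which is why the field
widths would be hard to change later.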