* Thread pointer changes
       [not found] ` <20140611145533.GT179@brightrain.aerifal.cx>
@ 2014-06-27 19:27   ` Andy Lutomirski
  2014-06-27 20:09     ` Russell King - ARM Linux
                       ` (2 more replies)
  0 siblings, 3 replies; 30+ messages in thread
From: Andy Lutomirski @ 2014-06-27 19:27 UTC (permalink / raw)
  To: linux-arm-kernel

On 06/11/2014 07:55 AM, Rich Felker wrote:
> On Tue, Jun 10, 2014 at 03:28:35AM -0400, Rich Felker wrote:
>> For ARM, I think we should revisit the thread-pointer/atomic inlining
>> work that was done as a sloppy workaround for kernels without the
>> kuser_helper page. If the set_thread_area syscall fails (due to an old
>> kernel that doesn't support it), we can set up a function pointer for
>> the __aeabi_read_tp function that only supports a single main thread
>> and returns its thread pointer. Likewise at this stage we could detect
>> the presence or absence of the kuser_helper page and substitute our
>> own fallbacks (using the instructions directly) if needed. One thing
>> that should be checked though is whether there are any kernel versions
>> which support EABI syscalls but not the thread-pointer setup syscall.
>> If not, there's really no use in having a fallback for that. These
>> slides look like they might shed some light on the history:
>> http://wookware.org/talks/armeabidebconf.pdf
> 
> According to those slides, the EABI-form syscall convention (which
> musl requires) was not added to Linux until 2.6.15. The kuser helpers
> (just thread pointer and atomic cas) were originally added in 2.6.12.
> Since we already have a bigger reason not to support pre-2.6.15
> kernels on ARM, there's no need to worry about whether TLS is
> supported by the kernel; we can just assume __set_thread_area will
> always work.
> 
> As for whether the kuser helper page exists, there are three cases:
> 
> 1. Normal kuser page - everything available and works.
> 
> 2. Mainline kernel CONFIG_KUSER_HELPERS disabled - the page is present
>    but filled with a HCF instruction. This is easy to detect.
> 
> 3. Hardened grsec kernels - the page is completely missing, so
>    attempts to access it fault. And worse yet, most syscalls report
>    EFAULT even if it's present because it's above the task size
>    address limit (in the kernel address range). So far the only way
>    I've found to detect it is process_vm_readv which was not added
>    until 3.2; I'm not sure if there are relevant pre-3.2 grsec kernels
>    with kuser helpers disabled.
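
For reference, the process_vm_readv probe mentioned in case 3 might look
roughly like the sketch below.  It is only an illustration:
probe_readable is a made-up name, the raw syscall is used so the sketch
does not depend on a libc wrapper, and whether the call really succeeds
on the kuser page under a given kernel is taken from the message above,
not verified here.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for syscall() */
#endif
#include <stddef.h>
#include <sys/syscall.h>        /* SYS_process_vm_readv (Linux >= 3.2 headers) */
#include <sys/uio.h>            /* struct iovec */
#include <unistd.h>             /* getpid, syscall */

/*
 * Hypothetical helper: try to copy len bytes from addr in our own
 * address space into out, using process_vm_readv on ourselves.
 * Returns 1 if the whole range was read, 0 otherwise (unmapped address,
 * kernel without the syscall, or a sandbox that forbids it).
 * An unreadable address yields an error return, not a fault.
 */
int probe_readable(const void *addr, void *out, size_t len)
{
    struct iovec local  = { .iov_base = out,          .iov_len = len };
    struct iovec remote = { .iov_base = (void *)addr, .iov_len = len };

    long n = syscall(SYS_process_vm_readv, (long)getpid(),
                     &local, 1UL, &remote, 1UL, 0UL);
    return n == (long)len;
}
```

On ARM the address to probe would presumably be 0xffff0ffc, where the
kernel documentation places __kuser_helper_version; a return of 0 only
says the page could not be read this way, not why.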

Hi ARM people and Kees-

The "vectors" page appears to be an abomination that's a lot like the
x86_64 vsyscall page.  IMO it should be phased out.  I'm not going to do
it (I don't know enough about ARM, and I'm not really able to test),
but, having gone through cleaning up the vsyscall mess a couple of years
ago, I'd be happy to help if anyone is interested in doing it.

I'd suggest:

Step 1: Add an auxvec entry ASAP indicating the address of the vectors
page if present.  Possibly give some other positive indication if the
vectors page is *not* present, too.

Step 2: Add a config option, off by default, to make the vectors page be
a normal VMA.  Use _install_special_mapping for it.  See 3.16-rc2 on x86
for a very simple example.  arm/kernel/process.c has code for this, too,
but x86's is nicer (no arch_vma_name crap).  Embedded things (and
Chromium?) can enable this.

Step 3: Implement an emulated vectors page, just like x86_64 uses for
vsyscalls now.  This is conceptually simple, but it's a royal PITA for a
few reasons that I can go into detail about (and help fix!).

Step 4: Eventually convert ARM to use a vDSO instead.  Get rid of
sigpage and the "vectors" page.  Preserve compatibility by updating the
auxvec interface.  Provide both AT_SYSINFO_EHDR and AT_VDSO_FINDSYM
(which is a candidate interface that I might try to push for 3.17).

If anyone does this, merging it with the fancy new x86 vdso code would
probably be worthwhile.

The end game would be that systems with new kernels but old userspace
still work with degraded performance.  New kernels and new userspace are
quite happy.  New userspace on old kernels won't use the vectors page.


Thoughts?  Any takers?

--Andy


* Thread pointer changes
  2014-06-27 19:27   ` Thread pointer changes Andy Lutomirski
@ 2014-06-27 20:09     ` Russell King - ARM Linux
  2014-06-27 21:09       ` [musl] " Szabolcs Nagy
  2014-06-27 21:37     ` Rich Felker
  2014-06-27 23:20     ` Russell King - ARM Linux
  2 siblings, 1 reply; 30+ messages in thread
From: Russell King - ARM Linux @ 2014-06-27 20:09 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jun 27, 2014 at 12:27:45PM -0700, Andy Lutomirski wrote:
> The "vectors" page appears to be an abomination that's a lot like the
> x86_64 vsyscall page.  IMO it should be phased out.  I'm not going to do
> it (I don't know enough about ARM, and I'm not really able to test),
> but, having gone through cleaning up the vsyscall mess a couple of years
> ago, I'd be happy to help if anyone is interested in doing it.

It's cleaned up as much as is possible to do - I personally put the
work into that last Summer, so I'm fully aware of all the issues, and
where we are now with it is where we're going to have to stay for at
least the next five to ten years as changing it any further breaks
binary compatibility or severely impacts performance.

There are options:

1. The page is present (which is the default setting) with kuser helpers
   present.

2. The page is not accessible, which must only be set where the guy
   configuring the kernel is sure that the userspace does not need the
   kuser page.  This point is clearly made in the help text for
   CONFIG_KUSER_HELPERS.

There isn't actually a third combination (page accessible but no kuser
helpers) or if there is, that's something someone else has added which
is not part of the mainline kernel.

One thing which did change last Summer is that various bits of kernel
code were moved out of the vectors page, code which user space has no
business poking its nose at (and using it to discover some useful
kernel addresses), and the vectors page is now also poisoned to mitigate
against random jumps into the page at non-ABI locations.  Neither
of those two changes should affect any legitimate userspace application.

As I said above, disabling CONFIG_KUSER_HELPERS is known to be an ABI
break, and it's well documented as such.  The future will be running
systems without the kuser helpers because on ARMv6 and later, there's
little point in having them.  In fact, today almost all C libraries
are built without needing the kuser helpers.

As the future will be no kuser helper page, there's no point in trying
to turn it into a VDSO page - that presents many challenges as it's
not as trivial as you think.  Part of the problem that the kuser helper
page addresses is how to deal with per-thread data on early CPUs without
having any per-thread registers to store it.  That requires uniprocessor
and the kernel to poke the thread pointer into the page - and userspace
then needs to be able to access it reliably.

Due to that, any ARMv5 or earlier CPU will always have the kuser helper
page.  ARMv6 and later may or may not have the kuser helper page, but
there you're really building for a different ABI anyway (VFP-based) and
you also know that you have the thread registers.

The final point to make is that only the C library should really be
concerned about this page, not applications.  Applications should not
be making direct calls into this page.
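
As a rough illustration of what "the C library using this page" means,
here is a minimal sketch.  The addresses and minimum version numbers
are the ones given in the kernel's kuser helpers documentation; the
function and enum names are invented for this example, and the
ARM-only part will fault if the page is not mapped at all.

```c
#include <stddef.h>
#include <stdint.h>

/* Fixed addresses documented for the ARM kuser helper page. */
#define KUSER_HELPER_VERSION_ADDR 0xffff0ffcUL  /* int32_t: helper count */
#define KUSER_GET_TLS_ADDR        0xffff0fe0UL
#define KUSER_CMPXCHG_ADDR        0xffff0fc0UL
#define KUSER_MEMORY_BARRIER_ADDR 0xffff0fa0UL

/* Each value is the minimum __kuser_helper_version providing the helper. */
enum kuser_helper {
    KUSER_GET_TLS        = 1,
    KUSER_CMPXCHG        = 2,
    KUSER_MEMORY_BARRIER = 3
};

/* Hypothetical helper: is helper h provided by a page of this version? */
int kuser_helper_usable(int32_t version, enum kuser_helper h)
{
    return version >= (int32_t)h;
}

#if defined(__arm__) && defined(__linux__)
typedef void *(*kuser_get_tls_fn)(void);

/*
 * Read the thread pointer through the helper page.  Only safe when the
 * page is known to be present; with the page disabled this faults.
 */
void *read_tp_via_kuser(void)
{
    int32_t version = *(volatile const int32_t *)KUSER_HELPER_VERSION_ADDR;

    if (!kuser_helper_usable(version, KUSER_GET_TLS))
        return NULL;
    return ((kuser_get_tls_fn)KUSER_GET_TLS_ADDR)();
}
#endif
```

A libc would hide something like read_tp_via_kuser() behind
__aeabi_read_tp, so applications never see these addresses.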

-- 
FTTC broadband for 0.8mile line: now at 9.7Mbps down 460kbps up... slowly
improving, and getting towards what was expected from it.


* [musl] Re: Thread pointer changes
  2014-06-27 20:09     ` Russell King - ARM Linux
@ 2014-06-27 21:09       ` Szabolcs Nagy
  2014-06-27 21:30         ` Russell King - ARM Linux
  0 siblings, 1 reply; 30+ messages in thread
From: Szabolcs Nagy @ 2014-06-27 21:09 UTC (permalink / raw)
  To: linux-arm-kernel

* Russell King - ARM Linux <linux@arm.linux.org.uk> [2014-06-27 21:09:49 +0100]:
> As I said above, disabling CONFIG_KUSER_HELPERS is known to be an ABI
> break, and it's well documented as such.  The future will be running
> systems without the kuser helpers because on ARMv6 and later, there's
> little point in having them.  In fact, today almost all C libraries
> are built without needing the kuser helpers.

i thought the helpers in the kernel can avoid certain memory
barriers that the userspace has to do on armv6 for atomics
(and those barriers are deprecated on armv7 so i thought the
kuser page was better for portable binaries)

> Due to that, any ARMv5 or earlier CPU will always have the kuser helper
> page.  ARMv6 and later may or may not have the kuser helper page, but
> there you're really building for a different ABI anyway (VFP-based) and
> you also know that you have the thread registers.

so is it expected that the libc makes no attempt to provide
portable binary interface for armv5 and armv6?

ie should musl treat those as different targets?


* [musl] Re: Thread pointer changes
  2014-06-27 21:09       ` [musl] " Szabolcs Nagy
@ 2014-06-27 21:30         ` Russell King - ARM Linux
  2014-06-27 21:47           ` Andy Lutomirski
  2014-06-27 21:55           ` Rich Felker
  0 siblings, 2 replies; 30+ messages in thread
From: Russell King - ARM Linux @ 2014-06-27 21:30 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jun 27, 2014 at 11:09:31PM +0200, Szabolcs Nagy wrote:
> i thought the helpers in the kernel can avoid certain memory
> barriers that the userspace has to do on armv6 for atomics
> (and those barriers are deprecated on armv7 so i thought the
> kuser page was better for portable binaries)

The helpers are provided so that libc can be independent of the CPU
facilities in the machine.  The key word there is _libc_, not
applications.

So, a libc can be built to support the lowest architecture that
someone deems to support, and it may make use of the kuser helpers.
If it does, then you have a libc which requires that the kuser
helpers are always provided by the kernel, and the KUSER_HELPERS
option must never be disabled.  If it is disabled, then the libc
will be useless against that kernel.

However, a libc built against modern architectures should not be
making use of the kuser helpers.  We found last year that the
Ubuntu 12.04 glibc did still make use of one kuser helper, and
as such Ubuntu 12.04 also needs KUSER_HELPERS to remain enabled.

The last combination is that the libc is built for modern architectures
without needing any kuser helpers at all.  In this case - and only this
case - the kernel's KUSER_HELPERS option can be disabled should the
system integrator want to increase security.

> > Due to that, any ARMv5 or earlier CPU will always have the kuser helper
> > page.  ARMv6 and later may or may not have the kuser helper page, but
> > there you're really building for a different ABI anyway (VFP-based) and
> > you also know that you have the thread registers.
> 
> so is it expected that the libc makes no attempt to provide
> portable binary interface for armv5 and armv6?

The libc interface that applications make use of should not have any
dependence on whether KUSER_HELPERS is enabled or disabled, the
presence of that page should be totally invisible to applications.

-- 
FTTC broadband for 0.8mile line: now at 9.7Mbps down 460kbps up... slowly
improving, and getting towards what was expected from it.


* [musl] Re: Thread pointer changes
  2014-06-27 19:27   ` Thread pointer changes Andy Lutomirski
  2014-06-27 20:09     ` Russell King - ARM Linux
@ 2014-06-27 21:37     ` Rich Felker
  2014-06-27 22:04       ` Russell King - ARM Linux
  2014-06-27 23:20     ` Russell King - ARM Linux
  2 siblings, 1 reply; 30+ messages in thread
From: Rich Felker @ 2014-06-27 21:37 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jun 27, 2014 at 12:27:45PM -0700, Andy Lutomirski wrote:
> Hi ARM people and Kees-
> 
> The "vectors" page appears to be an abomination that's a lot like the
> x86_64 vsyscall page.  IMO it should be phased out.

I'm not a fan of this extreme approach, but if it's taken, there needs
to be some way to continue to make universal binaries which work
safely on:

- Pre-v6, v6, and v7+ hardware.

- Pre-removal and post-removal kernels.

> Step 1: Add an auxvec entry ASAP indicating the address of the vectors
> page if present.  Possibly give some other positive indication if the
> vectors page is *not* present, too.

There should definitely be a positive indication of the absence of the
vectors page if it's removed, and it would also be nice to transition
to having the address non-fixed. What about reusing AT_SYSINFO with:

- AT_SYSINFO undefined having the current meaning: kuser page at the
  legacy fixed location.

- AT_SYSINFO defined as (void *)-1: kuser page disabled; this
  inherently indicates a cpu that supports TLS register and
  ldrex/strex and dmb.

- AT_SYSINFO defined as (void *)-2: kuser page disabled; this
  inherently indicates a cpu that supports TLS register and
  ldrex/strex but requires the old mcr-based barrier.

- AT_SYSINFO defined as any other value: the definition is the base
  address of the "vector page" (kuser helpers).
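
To make the proposed encoding concrete, a libc could decode it along
the lines of the sketch below.  This is purely hypothetical: no kernel
implements these AT_SYSINFO values on ARM, and the type and function
names are invented for the example.  The "present" flag is separate
because an absent auxv entry cannot be told apart from a zero value by
looking at the value alone.

```c
#include <stdint.h>

/* Legacy fixed location of the ARM vectors page (kuser helpers at its top). */
#define LEGACY_VECTORS_PAGE_BASE ((uintptr_t)0xffff0000UL)

enum kuser_state {
    KUSER_LEGACY_FIXED,          /* AT_SYSINFO absent: page at legacy address */
    KUSER_DISABLED_DMB,          /* (void *)-1: no page; TLS reg, ldrex/strex, dmb */
    KUSER_DISABLED_MCR_BARRIER,  /* (void *)-2: no page; needs mcr-based barrier */
    KUSER_AT_ADDRESS             /* any other value: page base address */
};

struct kuser_info {
    enum kuser_state state;
    uintptr_t base;              /* 0 when the page is disabled */
};

/* Hypothetical decoder for the AT_SYSINFO scheme proposed above. */
struct kuser_info decode_at_sysinfo(int present, uintptr_t value)
{
    struct kuser_info info = { KUSER_LEGACY_FIXED, LEGACY_VECTORS_PAGE_BASE };

    if (!present)
        return info;

    if (value == (uintptr_t)-1) {
        info.state = KUSER_DISABLED_DMB;
        info.base = 0;
    } else if (value == (uintptr_t)-2) {
        info.state = KUSER_DISABLED_MCR_BARRIER;
        info.base = 0;
    } else {
        info.state = KUSER_AT_ADDRESS;
        info.base = value;
    }
    return info;
}
```

An old kernel that knows nothing about the scheme simply falls into the
"absent" case, so existing behaviour is preserved.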

Disabling the vector page should not be possible at all on pre-v6, but
if you really want to make that possible, the atomic CAS syscall needs
to be made into a public API so we can make a syscall for CAS...

If the multiple values for "kuser page disabled" are deemed to be too
much of a hack, the equivalent information should be encoded in
AT_HWCAP. It's rather ridiculous how much useless information is in
AT_HWCAP while the most useful piece of information -- whether
hardware atomics work -- is missing. HWCAP_TLS is an approximation for
this, but it's not the same; there are some corner-case CPUs that have
the one but not the other. And it also doesn't address the case where
the kernel traps and emulates the features (which is probably horribly
undesirable from a performance standpoint, but would be a viable
configuration alternative to having the vector page).

> Step 2: Add a config option, off by default, to make the vectors page be
> a normal VMA.  Use _install_special_mapping for it.  See 3.16-rc2 on x86
> for a very simple example.  arm/kernel/process.c has code for this, too,
> but x86's is nicer (no arch_vma_name crap).  Embedded things (and
> Chromium?) can enable this.
> 
> Step 3: Implement an emulated vectors page, just like x86_64 uses for
> vsyscalls now.  This is conceptually simple, but it's a royal PITA for a
> few reasons that I can go into detail about (and help fix!).

What is the difference between an "emulated vectors page" and the
real one?

> Step 4: Eventually convert ARM to use a vDSO instead.  Get rid of
> sigpage and the "vectors" page.  Preserve compatibility by updating the
> auxvec interface.  Provide both AT_SYSINFO_EHDR and AT_VDSO_FINDSYM
> (which is a candidate interface that I might try to push for 3.17).

This would also be okay.

> If anyone does this, merging it with the fancy new x86 vdso code would
> probably be worthwhile.
> 
> The end game would be that systems with new kernels but old userspace
> still work with degraded performance.  New kernels and new userspace are
> quite happy.  New userspace on old kernels won't use the vectors page.

New userspace should be able to run on old kernels. There are plenty
of devices out there where the kernel is not upgradable (e.g. due to
proprietary drivers, or abandoned free ones, that don't work on new
kernels), and one of the big usage cases of musl is to be able to make
static-linked binaries that you can put on such devices to extend
them.

Rich


* [musl] Re: Thread pointer changes
  2014-06-27 21:30         ` Russell King - ARM Linux
@ 2014-06-27 21:47           ` Andy Lutomirski
  2014-06-27 21:58             ` Rich Felker
  2014-06-27 21:55           ` Rich Felker
  1 sibling, 1 reply; 30+ messages in thread
From: Andy Lutomirski @ 2014-06-27 21:47 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jun 27, 2014 at 2:30 PM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Fri, Jun 27, 2014 at 11:09:31PM +0200, Szabolcs Nagy wrote:
>> i thought the helpers in the kernel can avoid certain memory
>> barriers that the userspace has to do on armv6 for atomics
>> (and those barriers are deprecated on armv7 so i thought the
>> kuser page was better for portable binaries)
>
> The helpers are provided so that libc can be independent of the CPU
> facilities in the machine.  The key word there is _libc_, not
> applications.
>
> So, a libc can be built to support the lowest architecture that
> someone deems to support, and it may make use of the kuser helpers.
> If it does, then you have a libc which requires that the kuser
> helpers are always provided by the kernel, and the KUSER_HELPERS
> option must never be disabled.  If it is disabled, then the libc
> will be useless against that kernel.
>
> However, a libc built against modern architectures should not be
> making use of the kuser helpers.  We found last year that the
> Ubuntu 12.04 glibc did still make use of one kuser helper, and
> as such Ubuntu 12.04 also needs KUSER_HELPERS to remain enabled.
>
> The last combination is that the libc is built for modern architectures
> without needing any kuser helpers at all.  In this case - and only this
> case - the kernel's KUSER_HELPERS option can be disabled should the
> system integrator want to increase security.
>
>> > Due to that, any ARMv5 or earlier CPU will always have the kuser helper
>> > page.  ARMv6 and later may or may not have the kuser helper page, but
>> > there you're really building for a different ABI anyway (VFP-based) and
>> > you also know that you have the thread registers.
>>
>> so is it expected that the libc makes no attempt to provide
>> portable binary interface for armv5 and armv6?
>
> The libc interface that applications make use of should not have any
> dependence on whether KUSER_HELPERS is enabled or disabled, the
> presence of that page should be totally invisible to applications.

As of right now, an x86_64 libc can have good performance on any
recent kernel and will work correctly on any kernel.  From what you're
saying, it sounds impossible to implement such a thing on ARM without
fiddling with /proc.

--Andy


* [musl] Re: Thread pointer changes
  2014-06-27 21:30         ` Russell King - ARM Linux
  2014-06-27 21:47           ` Andy Lutomirski
@ 2014-06-27 21:55           ` Rich Felker
  2014-06-27 22:17             ` Russell King - ARM Linux
  1 sibling, 1 reply; 30+ messages in thread
From: Rich Felker @ 2014-06-27 21:55 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jun 27, 2014 at 10:30:51PM +0100, Russell King - ARM Linux wrote:
> On Fri, Jun 27, 2014 at 11:09:31PM +0200, Szabolcs Nagy wrote:
> > i thought the helpers in the kernel can avoid certain memory
> > barriers that the userspace has to do on armv6 for atomics
> > (and those barriers are deprecated on armv7 so i thought the
> > kuser page was better for portable binaries)
> 
> The helpers are provided so that libc can be independent of the CPU
> facilities in the machine.  The key word there is _libc_, not
> applications.
> 
> So, a libc can be built to support the lowest architecture that
> someone deems to support, and it may make use of the kuser helpers.
> If it does, then you have a libc which requires that the kuser
> helpers are always provided by the kernel, and the KUSER_HELPERS
> option must never be disabled.  If it is disabled, then the libc
> will be useless against that kernel.
> 
> However, a libc built against modern architectures should not be
> making use of the kuser helpers.  We found last year that the
> Ubuntu 12.04 glibc did still make use of one kuser helper, and
> as such Ubuntu 12.04 also needs KUSER_HELPERS to remain enabled.
> 
> The last combination is that the libc is built for modern architectures
> without needing any kuser helpers at all.  In this case - and only this
> case - the kernel's KUSER_HELPERS option can be disabled should the
> system integrator want to increase security.

I think you're assuming that libc is used only as a shared library and
that the user installs one appropriate for their kernel. This
precludes the use of static-linked binaries which are an extremely
important usage case for us, especially on ARM where, for example, we
want users to be able to make binaries that have a fully-working libc
but that can be run on Android, where neither musl nor any other
remotely-working libc is installed by default.

Obviously some (many) users will opt to build libc with a particular
-march where all of the necessary instructions for TLS and atomics are
available without help from the kernel. However, if attempting to
build a baseline libc that works on any model results in one that
can't work on new hardware/kernel, that's a big problem, and exactly
the one which I'm trying to solve.

Rich


* [musl] Re: Thread pointer changes
  2014-06-27 21:47           ` Andy Lutomirski
@ 2014-06-27 21:58             ` Rich Felker
  0 siblings, 0 replies; 30+ messages in thread
From: Rich Felker @ 2014-06-27 21:58 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jun 27, 2014 at 02:47:38PM -0700, Andy Lutomirski wrote:
> >> > Due to that, any ARMv5 or earlier CPU will always have the kuser helper
> >> > page.  ARMv6 and later may or may not have the kuser helper page, but
> >> > there you're really building for a different ABI anyway (VFP-based) and
> >> > you also know that you have the thread registers.
> >>
> >> so is it expected that the libc makes no attempt to provide
> >> portable binary interface for armv5 and armv6?
> >
> > The libc interface that applications make use of should not have any
> > dependence on whether KUSER_HELPERS is enabled or disabled, the
> > presence of that page should be totally invisible to applications.
> 
> As of right now, an x86_64 libc can have good performance on any
> recent kernel and will work correctly on any kernel.  From what you're
> saying, it sounds impossible to implement such a thing on ARM without
> fiddling with /proc.

Indeed. I wasn't even aware of the legacy vsyscall mess for x86_64,
and we're not using it in musl. Keeping it around seems like just a
matter of maintaining API/ABI compatibility with legacy versions of
glibc that are using it. On the other hand, the kuser helper page is
the ONLY way for a libc that's compatible with pre-v6 arm to get
working atomics and TLS.

Rich


* [musl] Re: Thread pointer changes
  2014-06-27 21:37     ` Rich Felker
@ 2014-06-27 22:04       ` Russell King - ARM Linux
  2014-06-27 22:26         ` Rich Felker
  2014-06-28  7:09         ` u-igbb at aetey.se
  0 siblings, 2 replies; 30+ messages in thread
From: Russell King - ARM Linux @ 2014-06-27 22:04 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jun 27, 2014 at 05:37:43PM -0400, Rich Felker wrote:
> On Fri, Jun 27, 2014 at 12:27:45PM -0700, Andy Lutomirski wrote:
> > Hi ARM people and Kees-
> > 
> > The "vectors" page appears to be an abomination that's a lot like the
> > x86_64 vsyscall page.  IMO it should be phased out.
> 
> I'm not a fan of this extreme approach, but if it's taken, there needs
> to be some way to continue to make universal binaries which work
> safely on:
> 
> - Pre-v6, v6, and v7+ hardware.

And that is to make use of the kuser page.

> - Pre-removal and post-removal kernels.

Please, don't get stuff confused.

The kuser page isn't going anywhere _except_ for one situation: where
the system integrator has determined that they have built their
userspace not to require the kuser helpers, and wish to have greater
security and performance through doing so.

What we *aren't* doing with the kernel is removing the page entirely
for ARMv7+ CPUs.  We are merely giving system integrators the option
to increase security /provided/ their userspace is built appropriately.

If the userspace requires the kuser helpers, then the page *must* be
provided by the kernel.

It is a system integrator/distro bug to provide a kernel with the kuser
helpers disabled, but then to provide a libc which requires the page.

The system integrator has a choice to make:

1. Have the kuser helper page, have libc support for it, and have
   compatibility for all ARM CPUs - but have the (minor) security
   issues raised last year over the page.

or

2. Disable the kuser helper page on later CPUs which provide /all/
   the facilities userspace needs in hardware, and build for a
   minimum of those CPUs.

-- 
FTTC broadband for 0.8mile line: now at 9.7Mbps down 460kbps up... slowly
improving, and getting towards what was expected from it.


* [musl] Re: Thread pointer changes
  2014-06-27 21:55           ` Rich Felker
@ 2014-06-27 22:17             ` Russell King - ARM Linux
  2014-06-27 22:25               ` Andy Lutomirski
                                 ` (2 more replies)
  0 siblings, 3 replies; 30+ messages in thread
From: Russell King - ARM Linux @ 2014-06-27 22:17 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jun 27, 2014 at 05:55:41PM -0400, Rich Felker wrote:
> I think you're assuming that libc is used only as a shared library and
> that the user installs one appropriate for their kernel. This
> precludes the use of static-linked binaries which are an extremely
> important usage case for us, especially on ARM where, for example, we
> want users to be able to make binaries that have a fully-working libc
> but that can be run on Android, where neither musl nor any other
> remotely-working libc is installed by default.
> 
> Obviously some (many) users will opt to build libc with a particular
> -march where all of the necessary instructions for TLS and atomics are
> available without help from the kernel. However, if attempting to
> build a baseline libc that works on any model results in one that
> can't work on new hardware/kernel, that's a big problem, and exactly
> the one which I'm trying to solve.

As I've already said, that's a system integrator bug to have a kernel
without a kuser page with userspace which requires it.

I think you are missing one obvious solution to this which you can do:
you are passed the HWCAP fields in the ELF auxinfo.  This will tell you
if the CPU has TLS support or not.  If it has TLS support, then you don't
need to use the kuser helpers, and you know that it is a CPU which is ARM
architecture v6k or later, and it has things like the CP15 barrier
instructions.  If you want to know that the CPU supports the DMB
instruction rather than the CP15 barrier instruction, then you have to
check the uname details, or read /proc/cpuinfo (but I'd rather you
didn't.)

In addition, the HWCAP fields tell you about some of the other
instructions and FP options which are available to you, whether there's
integer division available, etc.
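
A minimal sketch of that check follows, assuming a libc that provides
getauxval().  The bit value is the ARM HWCAP_TLS bit (1 << 15) from the
kernel's hwcap header; on other architectures the same bit means
something else, so the decision function is kept separate from the
auxv lookup.  The enum and function names are made up for this example.

```c
#include <sys/auxv.h>   /* getauxval, AT_HWCAP */

/* ARM HWCAP_TLS bit; named differently here to avoid header clashes. */
#define ARM_HWCAP_TLS (1UL << 15)

enum tp_source {
    TP_FROM_KUSER_HELPER,   /* no TLS register: call __kuser_get_tls */
    TP_FROM_CP15_REGISTER   /* TLS register present: read it directly */
};

/* Decide where the thread pointer should come from, given AT_HWCAP. */
enum tp_source choose_tp_source(unsigned long hwcap)
{
    return (hwcap & ARM_HWCAP_TLS) ? TP_FROM_CP15_REGISTER
                                   : TP_FROM_KUSER_HELPER;
}

/* Same decision, using the hwcap word the kernel passed to this process. */
enum tp_source detect_tp_source(void)
{
    return choose_tp_source(getauxval(AT_HWCAP));
}
```

This only settles the thread pointer question; it says nothing about
which barrier instruction to use.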

-- 
FTTC broadband for 0.8mile line: now at 9.7Mbps down 460kbps up... slowly
improving, and getting towards what was expected from it.


* [musl] Re: Thread pointer changes
  2014-06-27 22:17             ` Russell King - ARM Linux
@ 2014-06-27 22:25               ` Andy Lutomirski
  2014-06-27 22:54                 ` Russell King - ARM Linux
  2014-06-27 22:33               ` Rich Felker
  2014-06-27 22:40               ` Szabolcs Nagy
  2 siblings, 1 reply; 30+ messages in thread
From: Andy Lutomirski @ 2014-06-27 22:25 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jun 27, 2014 at 3:17 PM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Fri, Jun 27, 2014 at 05:55:41PM -0400, Rich Felker wrote:
>> I think you're assuming that libc is used only as a shared library and
>> that the user installs one appropriate for their kernel. This
>> precludes the use of static-linked binaries which are an extremely
>> important usage case for us, especially on ARM where, for example, we
>> want users to be able to make binaries that have a fully-working libc
>> but that can be run on Android, where neither musl nor any other
>> remotely-working libc is installed by default.
>>
>> Obviously some (many) users will opt to build libc with a particular
>> -march where all of the necessary instructions for TLS and atomics are
>> available without help from the kernel. However, if attempting to
>> build a baseline libc that works on any model results in one that
>> can't work on new hardware/kernel, that's a big problem, and exactly
>> the one which I'm trying to solve.
>
> As I've already said, that's a system integrator bug to have a kernel
> without a kuser page with userspace which requires it.

Shouldn't the goal be to reduce the number of new userspace programs
that require a kuser page on TLS-capable hardware?  In 2012, I made an
effort to do exactly that on x86_64 wrt the vsyscall page, and, these
days, a system booted with vsyscall=none is likely to be fully
functional as long as vdso=0 isn't specified.  Hopefully, in a couple
of years, even vdso=0 vsyscall=none will work with all freshly-built
binaries.

>
> I think you are missing one obvious solution to this which you can do:
> you are passed the HWCAP fields in the ELF auxinfo.  This will tell you
> if the CPU has TLS support or not.  If it has TLS support, then you don't
> need to use the kuser helpers, and you know that it is a CPU which is ARM
> architecture v6k or later, and it has things like the CP15 barrier
> instructions.  If you want to know that the CPU supports the DMB
> instruction rather than the CP15 barrier instruction, then you have to
> check the uname details, or read /proc/cpuinfo (but I'd rather you
> didn't.)

That sounds helpful.  Would it make sense to try to convince all libc
providers (and Go!) to do this?

If DMB vs CP15 makes a big difference, then adding that to HWCAP might
be a good idea.

--Andy


* [musl] Re: Thread pointer changes
  2014-06-27 22:04       ` Russell King - ARM Linux
@ 2014-06-27 22:26         ` Rich Felker
  2014-06-27 23:03           ` Russell King - ARM Linux
  2014-06-28  7:09         ` u-igbb at aetey.se
  1 sibling, 1 reply; 30+ messages in thread
From: Rich Felker @ 2014-06-27 22:26 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jun 27, 2014 at 11:04:00PM +0100, Russell King - ARM Linux wrote:
> On Fri, Jun 27, 2014 at 05:37:43PM -0400, Rich Felker wrote:
> > On Fri, Jun 27, 2014 at 12:27:45PM -0700, Andy Lutomirski wrote:
> > > Hi ARM people and Kees-
> > > 
> > > The "vectors" page appears to be an abomination that's a lot like the
> > > x86_64 vsyscall page.  IMO it should be phased out.
> > 
> > I'm not a fan of this extreme approach, but if it's taken, there needs
> > to be some way to continue to make universal binaries which work
> > safely on:
> > 
> > - Pre-v6, v6, and v7+ hardware.
> 
> And that is to make use of the kuser page.
> 
> > - Pre-removal and post-removal kernels.
> 
> Please, don't get stuff confused.
> 
> The kuser page isn't going anywhere _except_ for one situation: where
> the system integrator has determined that they have built their
> userspace not to require the kuser helpers, and wish to have greater
> security and performance through doing so.
> 
> What we *aren't* doing with the kernel is removing the page entirely
> for ARMv7+ CPUs.  We are merely giving system integrators the option
> to increase security /provided/ their userspace is built appropriately.
> 
> If the userspace requires the kuser helpers, then the page *must* be
> provided by the kernel.
> 
> It is a system integrator/distro bug to provide a kernel with the kuser
> helpers disabled, but then to provide a libc which requires the page.

From our standpoint it is not necessarily the system integrator/distro
who provides the libc, but rather the distributor of the (possibly
static-linked, or possibly dynamic-linked but with its own full set of
libs) application binary. As a provider of tools used to produce such
binaries, we want these binaries to work regardless of how the kernel
was configured.

Thus, my interest is in ensuring that, whenever the kuser helper page
is disabled (or, as proposed, moved to a dynamic address), there's a
safe and efficient way to detect this condition and know what to do to
work around it.

Rich

* [musl] Re: Thread pointer changes
  2014-06-27 22:17             ` Russell King - ARM Linux
  2014-06-27 22:25               ` Andy Lutomirski
@ 2014-06-27 22:33               ` Rich Felker
  2014-06-27 23:07                 ` Russell King - ARM Linux
  2014-06-27 22:40               ` Szabolcs Nagy
  2 siblings, 1 reply; 30+ messages in thread
From: Rich Felker @ 2014-06-27 22:33 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jun 27, 2014 at 11:17:44PM +0100, Russell King - ARM Linux wrote:
> On Fri, Jun 27, 2014 at 05:55:41PM -0400, Rich Felker wrote:
> > I think you're assuming that libc is used only as a shared library and
> > that the user installs one appropriate for their kernel. This
> > precludes the use of static-linked binaries which are an extremely
> > important usage case for us, especially on ARM where, for example, we
> > want users to be able to make binaries that have a fully-working libc
> > but that can be run on Android, where neither musl nor any other
> > remotely-working libc is installed by default.
> > 
> > Obviously some (many) users will opt to build libc with a particular
> > -march where all of the necessary instructions for TLS and atomics are
> > available without help from the kernel. However, if attempting to
> > build a baseline libc that works on any model results in one that
> > can't work on new hardware/kernel, that's a big problem, and exactly
> > the one which I'm trying to solve.
> 
> As I've already said, that's a system integrator bug to have a kernel
> without a kuser page with userspace which requires it.
> 
> I think you're are missing one obvious solution to this which you can do:
> you are passed the HWCAP fields in the ELF auxinfo.  This will tell you
> if the CPU has TLS support or not.  If it has TLS support, then you don't
> need to use the kuser helpers, and you know that it is a CPU which is ARM
> architecture v6k or later,

Are you sure? I think the minimum models for TLS and for LDREX/STREX
are different; one is plain v6 and the other is v6k. This greatly
complicates the issue since the kernel only tells you if the TLS
register is available, not if the atomics are. It also doesn't tell
you if the kernel is emulating the hardware features via traps rather
than providing the kuser helper page, which would be a semi-viable
configuration.

> and it has things like the CP15 barrier
> instructions.

No, the CP15 barrier is deprecated and may be removed in future chips
(like the SWP instruction was?), so it can't be used unless you know
that you have a pre-DMB machine that needs it.

> If you want to know that the CPU supports the DMB
> instruction rather than the CP15 barrier instruction, then you have to
> check the uname details, or read /proc/cpuinfo (but I'd rather you
> didn't.)

These are not safe operations. /proc may not be mounted, or the file
descriptor table may be full and open may fail, etc.

> In addition, the HWCAP fields tell you about some of the other
> instructions and FP options which are available to you, whether there's
> integer division available, etc.

Yes but for all of those it's safe to assume the lowest baseline. For
TLS and atomics, removal (even optional) of kuser helper page means
it's not safe to assume the lowest baseline; there MUST be a fallback
to use the higher-model-only instructions if the kernel lacks kuser
helper.

Rich

* [musl] Re: Thread pointer changes
  2014-06-27 22:17             ` Russell King - ARM Linux
  2014-06-27 22:25               ` Andy Lutomirski
  2014-06-27 22:33               ` Rich Felker
@ 2014-06-27 22:40               ` Szabolcs Nagy
  2014-06-27 22:51                 ` Andy Lutomirski
  2014-06-27 23:12                 ` Russell King - ARM Linux
  2 siblings, 2 replies; 30+ messages in thread
From: Szabolcs Nagy @ 2014-06-27 22:40 UTC (permalink / raw)
  To: linux-arm-kernel

* Russell King - ARM Linux <linux@arm.linux.org.uk> [2014-06-27 23:17:44 +0100]:
> I think you're are missing one obvious solution to this which you can do:
> you are passed the HWCAP fields in the ELF auxinfo.  This will tell you
> if the CPU has TLS support or not.  If it has TLS support, then you don't
> need to use the kuser helpers, and you know that it is a CPU which is ARM
> architecture v6k or later, and it has things like the CP15 barrier
> instructions.  If you want to know that the CPU supports the DMB
> instruction rather than the CP15 barrier instruction, then you have to
> check the uname details, or read /proc/cpuinfo (but I'd rather you
> didn't.)
> 

but cp15 barrier is deprecated on armv7+

and i think the kernel can avoid the barriers on non-smp systems making
kuser helpers possibly preferable even if TLS HWCAP flag is set

* [musl] Re: Thread pointer changes
  2014-06-27 22:40               ` Szabolcs Nagy
@ 2014-06-27 22:51                 ` Andy Lutomirski
  2014-06-27 23:12                 ` Russell King - ARM Linux
  1 sibling, 0 replies; 30+ messages in thread
From: Andy Lutomirski @ 2014-06-27 22:51 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jun 27, 2014 at 3:40 PM, Szabolcs Nagy <nsz@port70.net> wrote:
> * Russell King - ARM Linux <linux@arm.linux.org.uk> [2014-06-27 23:17:44 +0100]:
>> I think you're are missing one obvious solution to this which you can do:
>> you are passed the HWCAP fields in the ELF auxinfo.  This will tell you
>> if the CPU has TLS support or not.  If it has TLS support, then you don't
>> need to use the kuser helpers, and you know that it is a CPU which is ARM
>> architecture v6k or later, and it has things like the CP15 barrier
>> instructions.  If you want to know that the CPU supports the DMB
>> instruction rather than the CP15 barrier instruction, then you have to
>> check the uname details, or read /proc/cpuinfo (but I'd rather you
>> didn't.)
>>
>
> but cp15 barrier is deprecated on armv7+
>
> and i think the kernel can avoid the barriers on non-smp systems making
> kuser helpers possibly preferable even if TLS HWCAP flag is set

IMO it would be much better to do this by either telling userspace
that the system is !SMP or by allowing the kuser helpers to live at a
randomized address.

There's already been at least one paper describing an exploit
technique that's blocked by vsyscall emulation.  This kind of
hardening takes a while to get deployed, and making new userspace not
depend on fixed-address code is a good thing to get started on, at
least when targeting newer hardware.

--Andy

* [musl] Re: Thread pointer changes
  2014-06-27 22:25               ` Andy Lutomirski
@ 2014-06-27 22:54                 ` Russell King - ARM Linux
  2014-06-28  0:11                   ` Rich Felker
  0 siblings, 1 reply; 30+ messages in thread
From: Russell King - ARM Linux @ 2014-06-27 22:54 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jun 27, 2014 at 03:25:32PM -0700, Andy Lutomirski wrote:
> On Fri, Jun 27, 2014 at 3:17 PM, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
> > On Fri, Jun 27, 2014 at 05:55:41PM -0400, Rich Felker wrote:
> >> I think you're assuming that libc is used only as a shared library and
> >> that the user installs one appropriate for their kernel. This
> >> precludes the use of static-linked binaries which are an extremely
> >> important usage case for us, especially on ARM where, for example, we
> >> want users to be able to make binaries that have a fully-working libc
> >> but that can be run on Android, where neither musl nor any other
> >> remotely-working libc is installed by default.
> >>
> >> Obviously some (many) users will opt to build libc with a particular
> >> -march where all of the necessary instructions for TLS and atomics are
> >> available without help from the kernel. However, if attempting to
> >> build a baseline libc that works on any model results in one that
> >> can't work on new hardware/kernel, that's a big problem, and exactly
> >> the one which I'm trying to solve.
> >
> > As I've already said, that's a system integrator bug to have a kernel
> > without a kuser page with userspace which requires it.
> 
> Shouldn't the goal be to reduce the number of new userspace programs
> that require a kuser page on TLS-capable hardware?  In 2012, I made an
> effort to do exactly that on x86_64 wrt the vsyscall page, and, these
> days, a system booted with vsyscall=none is likely to be fully
> functional as long as vdso=0 isn't specified.  Hopefully, in a couple
> of years, even vdso=0 vsyscall=none will work with all freshly-built
> binaries.

You mean like how Ubuntu 14.04 (which is built for ARMv7 hard
float) does not require the kuser page for anything?  Ubuntu 12.04
needing it was rather unexpected; that came down to a glibc
configuration error.

Fedora doesn't support anything before ARMv6, but they also provide
ARMv7 optimised packages as well, and it's highly likely that the
ARMv7 packages don't need the page either.

I believe Android has moved in that direction too.

I don't know whether things like Arch or Debian have moved in that
direction yet, but I would be very surprised if they haven't.

> > I think you're are missing one obvious solution to this which you can do:
> > you are passed the HWCAP fields in the ELF auxinfo.  This will tell you
> > if the CPU has TLS support or not.  If it has TLS support, then you don't
> > need to use the kuser helpers, and you know that it is a CPU which is ARM
> > architecture v6k or later, and it has things like the CP15 barrier
> > instructions.  If you want to know that the CPU supports the DMB
> > instruction rather than the CP15 barrier instruction, then you have to
> > check the uname details, or read /proc/cpuinfo (but I'd rather you
> > didn't.)
> 
> That sounds helpful.  Would it make sense to try to convince all libc
> providers (and Go!) to do this?

I'm not sure it's worth the effort - as mentioned above, I suspect
most distros have dropped, or are dropping, support for the older
architectures which don't provide all the bells and whistles.

> If DMB vs CP15 makes a big difference, then adding that to HWCAP might
> be a good idea.

The CP15 version of the instruction (introduced in ARMv6) has been
deprecated in ARMv7, though we still use the CP15 instruction in the
kernel if we're including support for ARMv6 - we only use the ARMv7
DMB instruction when we're building only for ARMv7 architectures.

Oh, I should also have mentioned: for a libc, if you want to stretch
across from ARMv4 all the way up to ARMv7, then you have to do lots
more than just worry about thread local storage.  You also have the
problem that you can't just fall back on the SWP instruction to
provide atomic implementations - this instruction has been deprecated
and for the latest CPUs, the kernel may be configured to emulate this
instruction.  Besides, on ARMv6 and later, you really want to use the
load/store exclusive instructions for implementing atomic accesses
and not the horrid SWP instruction.  So you need to implement atomic
stuff using SWP for some CPUs and the new load/store exclusive for
other CPUs.

-- 
FTTC broadband for 0.8mile line: now at 9.7Mbps down 460kbps up... slowly
improving, and getting towards what was expected from it.

* [musl] Re: Thread pointer changes
  2014-06-27 22:26         ` Rich Felker
@ 2014-06-27 23:03           ` Russell King - ARM Linux
  0 siblings, 0 replies; 30+ messages in thread
From: Russell King - ARM Linux @ 2014-06-27 23:03 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jun 27, 2014 at 06:26:26PM -0400, Rich Felker wrote:
> Thus, my interest is in ensuring that, whenever the kuser helper page
> is disabled (or, as proposed, moved to a dynamic address), there's a
> safe and efficient way to detect this condition and know what to do to
> work around it.

Let me be clear: we are not going to move it to a dynamic address.
That's far too big a problem for the older architectures since
their caches alias all over the place.  Plus - and more importantly -
it means an ABI breakage across the board.

Let's say we did move it into a randomly mapped page in userspace.
We also need to keep the kuser page in place in order to support
existing libcs as well, but now we have not one but two places to
poke thread local values to - and we now have to flush the cache
line associated with that too, or have the kernel write directly
to that page in userspace.

We can't share those pages, because there's the danger that something
could write to the vectors page, and thus end up taking over the
whole machine.  Even if we did, we still have to flush the cache at
not just one location, but two.

As I've already pointed out, there are easier solutions to this problem
(using the ELF HWCAPs).

-- 
FTTC broadband for 0.8mile line: now at 9.7Mbps down 460kbps up... slowly
improving, and getting towards what was expected from it.

* [musl] Re: Thread pointer changes
  2014-06-27 22:33               ` Rich Felker
@ 2014-06-27 23:07                 ` Russell King - ARM Linux
  2014-06-27 23:17                   ` Andy Lutomirski
  0 siblings, 1 reply; 30+ messages in thread
From: Russell King - ARM Linux @ 2014-06-27 23:07 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jun 27, 2014 at 06:33:28PM -0400, Rich Felker wrote:
> On Fri, Jun 27, 2014 at 11:17:44PM +0100, Russell King - ARM Linux wrote:
> > As I've already said, that's a system integrator bug to have a kernel
> > without a kuser page with userspace which requires it.
> > 
> > I think you're are missing one obvious solution to this which you can do:
> > you are passed the HWCAP fields in the ELF auxinfo.  This will tell you
> > if the CPU has TLS support or not.  If it has TLS support, then you don't
> > need to use the kuser helpers, and you know that it is a CPU which is ARM
> > architecture v6k or later,
> 
> Are you sure? I think the minimum models for TLS and for LDREX/STREX
> are different; one is plain v6 and the other is v6k.

LDREX/STREX is v6+.  TLS is v6k+.

> This greatly
> complicates the issue since the kernel only tells you if the TLS
> register is available, not if the atomics are.

uname() ?

> > If you want to know that the CPU supports the DMB
> > instruction rather than the CP15 barrier instruction, then you have to
> > check the uname details, or read /proc/cpuinfo (but I'd rather you
> > didn't.)
> 
> These are not safe operations. /proc may not be mounted, or the file
> descriptor table may be full and open may fail, etc.

Balderdash.  Since when has uname not been safe?

> > In addition, the HWCAP fields tell you about some of the other
> > instructions and FP options which are available to you, whether there's
> > integer division available, etc.
> 
> Yes but for all of those it's safe to assume the lowest baseline. For
> TLS and atomics, removal (even optional) of kuser helper page means
> it's not safe to assume the lowest baseline; there MUST be a fallback
> to use the higher-model-only instructions if the kernel lacks kuser
> helper.

The kuser helpers can NOT be removed unless the CPU is v6k+.  Let me
put that a different way: the kuser helpers can not be removed unless
the ELF HWCAPs indicate TLS support.

If the ELF HWCAPs indicate TLS support, then you have atomics and you
have TLS registers, and you may or may not have the kuser helpers.
If it indicates no TLS support, then you will always have kuser helpers.
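
The rule above reduces to a single test on the AT_HWCAP word.  A minimal
sketch in C follows, assuming the 32-bit ARM value of HWCAP_TLS (bit 15
in the kernel's asm/hwcap.h).  The enum and function names are
hypothetical and are not taken from the kernel or from any libc.

```c
/* HWCAP_TLS as defined for 32-bit ARM in the kernel's <asm/hwcap.h>.
 * It is defined locally so the sketch also compiles on other hosts. */
#define ARM_HWCAP_TLS (1UL << 15)

/* Hypothetical names, for illustration only. */
enum tp_strategy {
    TP_USE_KUSER_HELPERS,  /* no hardware TLS: kuser page is guaranteed */
    TP_USE_NATIVE_INSNS    /* v6k+: TLS register and LDREX/STREX exist;
                              the kuser page may or may not be present */
};

/* Pick a thread-pointer/atomics strategy from the AT_HWCAP value. */
static enum tp_strategy choose_tp_strategy(unsigned long hwcap)
{
    return (hwcap & ARM_HWCAP_TLS) ? TP_USE_NATIVE_INSNS
                                   : TP_USE_KUSER_HELPERS;
}
```

On a real system the argument would typically come from
getauxval(AT_HWCAP), or from walking the aux vector directly in the
startup code of a static libc.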

-- 
FTTC broadband for 0.8mile line: now at 9.7Mbps down 460kbps up... slowly
improving, and getting towards what was expected from it.

* [musl] Re: Thread pointer changes
  2014-06-27 22:40               ` Szabolcs Nagy
  2014-06-27 22:51                 ` Andy Lutomirski
@ 2014-06-27 23:12                 ` Russell King - ARM Linux
  2014-06-28 16:37                   ` Szabolcs Nagy
  1 sibling, 1 reply; 30+ messages in thread
From: Russell King - ARM Linux @ 2014-06-27 23:12 UTC (permalink / raw)
  To: linux-arm-kernel

On Sat, Jun 28, 2014 at 12:40:17AM +0200, Szabolcs Nagy wrote:
> * Russell King - ARM Linux <linux@arm.linux.org.uk> [2014-06-27 23:17:44 +0100]:
> > I think you're are missing one obvious solution to this which you can do:
> > you are passed the HWCAP fields in the ELF auxinfo.  This will tell you
> > if the CPU has TLS support or not.  If it has TLS support, then you don't
> > need to use the kuser helpers, and you know that it is a CPU which is ARM
> > architecture v6k or later, and it has things like the CP15 barrier
> > instructions.  If you want to know that the CPU supports the DMB
> > instruction rather than the CP15 barrier instruction, then you have to
> > check the uname details, or read /proc/cpuinfo (but I'd rather you
> > didn't.)
> > 
> 
> but cp15 barrier is deprecated on armv7+

Can you _please_ read all my replies and stop cherry picking what you
want?  There's many of you, and only one of me - please do me the
effort of fully reading my replies before replying on points I've
already included.

I said just after you stopped reading the above quoted text:

"If you want to know that the CPU supports the DMB instruction rather
 than the CP15 barrier instruction, then you have to check the uname
 details"

That's because it's an _instruction_ _set_ _architecture_ issue, and
you can get that from uname.  If uname indicates v7, then you're running
on a CPU which has deprecated the CP15 barrier operations.  If it
indicates v6, then you're running on a CPU which has them but has no
DMB instruction etc.  If it's earlier than v6, you have no barrier
instructions whatsoever.
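
The mapping described above can be sketched in C as a function of the
utsname.machine string.  This is only an illustration: it assumes the
string has the usual "armv<N>..." form (for example "armv5tel",
"armv6l", "armv7l"), and the enum and function names are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

enum barrier_kind {
    BARRIER_NONE,  /* pre-v6: no barrier instructions */
    BARRIER_CP15,  /* v6: CP15 barrier operation */
    BARRIER_DMB    /* v7 and later: DMB instruction */
};

/* Map a utsname.machine string to a barrier choice.  Unrecognised
 * strings fall back to BARRIER_NONE, which is an arbitrary choice
 * made for this sketch. */
static enum barrier_kind barrier_for_machine(const char *machine)
{
    if (strncmp(machine, "armv", 4) != 0)
        return BARRIER_NONE;

    /* strtol stops at the first non-digit, so "5tel" parses as 5. */
    long arch = strtol(machine + 4, NULL, 10);
    if (arch >= 7)
        return BARRIER_DMB;
    if (arch == 6)
        return BARRIER_CP15;
    return BARRIER_NONE;
}
```

A caller would fill a struct utsname with uname() and pass its machine
field to barrier_for_machine().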

-- 
FTTC broadband for 0.8mile line: now at 9.7Mbps down 460kbps up... slowly
improving, and getting towards what was expected from it.

* [musl] Re: Thread pointer changes
  2014-06-27 23:07                 ` Russell King - ARM Linux
@ 2014-06-27 23:17                   ` Andy Lutomirski
  2014-06-27 23:35                     ` Russell King - ARM Linux
  0 siblings, 1 reply; 30+ messages in thread
From: Andy Lutomirski @ 2014-06-27 23:17 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jun 27, 2014 at 4:07 PM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> The kuser helpers can NOT be removed unless the CPU is v6k+.  Let me
> put that a different way: the kuser helpers can not be removed unless
> the ELF HWCAPs indicate TLS support.

Why?  (This is an honest question -- there may be an excellent
answer.)  I understand why they're needed in the first place, but I
don't understand why they need to live at a fixed address.

The closest thing to this that I'm familiar with is x86_32's sysenter.
It's a very useful instruction, but it's basically impossible for libc
to contain a sysenter instruction in the libc image.  So the kernel
provides one *at a randomized address*, and libc calls it.

Admittedly, x86_32 has an advantage over ARM here: libc without a
sysenter helper is completely functional; it's just slower.
Nonetheless, ISTM it should be possible to start advertising the kuser
helper address to libc, get all the libcs to play along, and then
offer an option of randomizing it for people who know that they don't
have any old libcs on their systems.

--Andy

* Thread pointer changes
  2014-06-27 19:27   ` Thread pointer changes Andy Lutomirski
  2014-06-27 20:09     ` Russell King - ARM Linux
  2014-06-27 21:37     ` Rich Felker
@ 2014-06-27 23:20     ` Russell King - ARM Linux
  2014-06-28  0:38       ` [musl] " Rich Felker
  2 siblings, 1 reply; 30+ messages in thread
From: Russell King - ARM Linux @ 2014-06-27 23:20 UTC (permalink / raw)
  To: linux-arm-kernel

Right, I'm done on this thread for the weekend - I will not be answering
another message on this subject until next week.  In summary:

I've shown that this isn't as big a problem as first thought, because
there's ways that a libc can trivially detect the CPU that it is running
on, and from that know what instructions are available to it.

I've indicated that the kuser helpers are always provided when there
is no hardware TLS support, which corresponds with a minimum ARM
architecture of version 6K, and v6k has the atomic instructions.

I've said that we're not going to move kuser helpers to a randomised
address, and given strong reasons why not.

I've indicated where the CPU architecture can be retrieved from, and
used to determine the availability of other instructions.

I've indicated that the ELF HWCAPs can be used to further refine the
available instruction information.

That should be sufficient to answer all the questions raised.  Please
wait until next week before asking further questions, thanks.

-- 
FTTC broadband for 0.8mile line: now at 9.7Mbps down 460kbps up... slowly
improving, and getting towards what was expected from it.

* [musl] Re: Thread pointer changes
  2014-06-27 23:17                   ` Andy Lutomirski
@ 2014-06-27 23:35                     ` Russell King - ARM Linux
  2014-06-27 23:40                       ` Andy Lutomirski
  2014-06-28  0:20                       ` Rich Felker
  0 siblings, 2 replies; 30+ messages in thread
From: Russell King - ARM Linux @ 2014-06-27 23:35 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jun 27, 2014 at 04:17:52PM -0700, Andy Lutomirski wrote:
> On Fri, Jun 27, 2014 at 4:07 PM, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
> > The kuser helpers can NOT be removed unless the CPU is v6k+.  Let me
> > put that a different way: the kuser helpers can not be removed unless
> > the ELF HWCAPs indicate TLS support.
> 
> Why?  (This is an honest question -- there may be an excellent
> answer.)  I understand why they're needed in the first place, but I
> don't understand why they need to live at a fixed address.

Just very briefly - and this will be my last reply on this until Monday,
especially so as I've spent a solid 1h30 replying to the emails in this
thread, and it's past midnight here.

I referred to the problems in an earlier reply (maybe in a different
sub-thread of this thread.)

To answer your "why" (which I'll take as "why can't they be removed")
the reason for that is the lack of TLS support.  If the kuser page
is removed _and_ there is no hardware TLS support, then there is no way
for userspace to have TLS.

As for your second bit (about why they need to live at a fixed address)
we could have randomised it on v6 and later CPUs, but there are some
v6 CPUs which suffer from data cache aliasing, just like all the previous
CPUs.  With the data cache aliasing, it would make the TLS implementation
a lot more complex.

As part of the TLS implementation for older CPUs, the TLS value is stored
right at the top of that page, and on every context switch, we have to
update that value.  If the page was at a randomised address, the kernel
would either have to poke directly into userspace and flush it from the
caches (which is error prone - what if userspace unmaps the page) or it
has to do cache flushing to ensure that the value is visible via the
user alias of the page.

If it wasn't for this need, we would have probably gone for a VDSO from
the start.
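
For reference, the fixed layout being discussed can be sketched in C.
The addresses and minimum version numbers below are the ones given in
the kernel's kernel_user_helpers documentation; the macro, enum and
function names are hypothetical, chosen for this sketch.

```c
#include <stdint.h>

/* Fixed addresses from the kernel's kernel_user_helpers document. */
#define KUSER_HELPER_VERSION_ADDR 0xffff0ffcu
#define KUSER_GET_TLS_ADDR        0xffff0fe0u
#define KUSER_CMPXCHG_ADDR        0xffff0fc0u
#define KUSER_MEMORY_BARRIER_ADDR 0xffff0fa0u

/* Each enumerator's value is the minimum __kuser_helper_version at
 * which that entry point is documented as valid. */
enum kuser_helper {
    KUSER_GET_TLS        = 1,
    KUSER_CMPXCHG        = 2,
    KUSER_MEMORY_BARRIER = 3
};

/* Nonzero if helper h may be called given the reported version. */
static int kuser_helper_valid(int32_t version, enum kuser_helper h)
{
    return version >= (int32_t)h;
}

/* Nonzero if addr lies in the 4 KiB page at 0xffff0000. */
static int in_vectors_page(uint32_t addr)
{
    return (addr & ~0xfffu) == 0xffff0000u;
}

#if defined(__arm__)
/* Only meaningful on 32-bit ARM with the kuser page mapped. */
static int32_t kuser_helper_version(void)
{
    return *(volatile const int32_t *)KUSER_HELPER_VERSION_ADDR;
}
#endif
```

Because every entry point sits at a constant address in one page, a
binary needs no relocation or discovery step to call them, which is
exactly the property that rules out randomising the page.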

-- 
FTTC broadband for 0.8mile line: now at 9.7Mbps down 460kbps up... slowly
improving, and getting towards what was expected from it.

* [musl] Re: Thread pointer changes
  2014-06-27 23:35                     ` Russell King - ARM Linux
@ 2014-06-27 23:40                       ` Andy Lutomirski
  2014-06-30 15:38                         ` Christopher Covington
  2014-06-28  0:20                       ` Rich Felker
  1 sibling, 1 reply; 30+ messages in thread
From: Andy Lutomirski @ 2014-06-27 23:40 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jun 27, 2014 at 4:35 PM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Fri, Jun 27, 2014 at 04:17:52PM -0700, Andy Lutomirski wrote:
>> On Fri, Jun 27, 2014 at 4:07 PM, Russell King - ARM Linux
>> <linux@arm.linux.org.uk> wrote:
>> > The kuser helpers can NOT be removed unless the CPU is v6k+.  Let me
>> > put that a different way: the kuser helpers can not be removed unless
>> > the ELF HWCAPs indicate TLS support.
>>
>> Why?  (This is an honest question -- there may be an excellent
>> answer.)  I understand why they're needed in the first place, but I
>> don't understand why they need to live at a fixed address.
>
> Just very briefly - and this will be my last reply on this until Monday,
> especially so as I've spent a solid 1h30 replying to the emails in this
> thread, and it's past midnight here.
>
> I referred to the problems in an earlier reply (maybe in a different
> sub-thread of this thread.)
>
> To answer your "why" (which I'll take as "why can't they be removed")
> the reason for that is the lack of TLS support.  If the kuser page
> is removed _and_ there is no hardware TLS support, then there is no way
> for userspace to have TLS.
>
> As for your second bit (about why they need to live at a fixed address)
> we could have randomised it on v6 and later CPUs, but there are some
> v6 CPUs which suffer from data cache aliasing, just like all the previous
> CPUs.  With the data cache aliasing, it would make the TLS implementation
> a lot more complex.

Right, got it.

>
> As part of the TLS implementation for older CPUs, the TLS value is stored
> right at the top of that page, and on every context switch, we have to
> update that value.  If the page was at a randomised address, the kernel
> would either have to poke directly into userspace and flush it from the
> caches (which is error prone - what if userspace unmaps the page) or it
> has to do cache flushing to ensure that the value is visible via the
> user alias of the page.
>

Hmm.  Maybe some day there'll be a reliable way to track a vdso.  This
currently doesn't exist, but x86_32 will need it if CRIU will ever
work there.

If you ever want an ARM vdso (e.g. for timing), I'd be happy to help
and try to share code with x86.

--Andy

* [musl] Re: Thread pointer changes
  2014-06-27 22:54                 ` Russell King - ARM Linux
@ 2014-06-28  0:11                   ` Rich Felker
  0 siblings, 0 replies; 30+ messages in thread
From: Rich Felker @ 2014-06-28  0:11 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jun 27, 2014 at 11:54:45PM +0100, Russell King - ARM Linux wrote:
> Oh, I should also have mentioned: for a libc, if you want to stretch
> across from ARMv4 all the way up to ARMv7, then you have to do lots
> more than just worry about thread local storage.  You also have the
> problem that you can't just fall back on the SWP instruction to
> provide atomic implementations - this instruction has been deprecated
> and for the latest CPUs, the kernel may be configured to emulate this
> instruction.  Besides, on ARMv6 and later, you really want to use the
> load/store exclusive instructions for implementing atomic accesses
> and not the horrid SWP instruction.  So you need to implement atomic
> stuff using SWP for some CPUs and the new load/store exclusive for
> other CPUs.

SWP is mostly useless for implementing the POSIX threads primitives,
most of which need compare-and-swap due to the requirement of
self-synchronized destruction. At the source level we do a swap
operation in a couple places, but so far on ARM it's just implemented
in terms of CAS. We could revisit this once there's runtime selection
of the lock code anyway; using SWP might make a big difference for
internal locks (e.g. in malloc) where the CAS semantics are not
needed.
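
As an illustration of a swap implemented in terms of CAS, here is a
minimal sketch using C11 atomics.  It stands in for the idea only: the
function name is hypothetical and this is not musl's internal code.

```c
#include <stdatomic.h>

/* Atomic exchange built from compare-and-swap: retry until the value
 * we last observed is still current when the new one is installed. */
static int swap_via_cas(atomic_int *p, int newval)
{
    int old = atomic_load(p);
    while (!atomic_compare_exchange_weak(p, &old, newval))
        ;  /* on failure, old has been updated to the current value */
    return old;
}
```

A native swap instruction would replace the whole loop with a single
operation, which is why it could matter for simple internal locks.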

Rich

* [musl] Re: Thread pointer changes
  2014-06-27 23:35                     ` Russell King - ARM Linux
  2014-06-27 23:40                       ` Andy Lutomirski
@ 2014-06-28  0:20                       ` Rich Felker
  1 sibling, 0 replies; 30+ messages in thread
From: Rich Felker @ 2014-06-28  0:20 UTC (permalink / raw)
  To: linux-arm-kernel

On Sat, Jun 28, 2014 at 12:35:05AM +0100, Russell King - ARM Linux wrote:
> On Fri, Jun 27, 2014 at 04:17:52PM -0700, Andy Lutomirski wrote:
> > On Fri, Jun 27, 2014 at 4:07 PM, Russell King - ARM Linux
> > <linux@arm.linux.org.uk> wrote:
> > > The kuser helpers can NOT be removed unless the CPU is v6k+.  Let me
> > > put that a different way: the kuser helpers can not be removed unless
> > > the ELF HWCAPs indicate TLS support.
> > 
> > Why?  (This is an honest question -- there may be an excellent
> > answer.)  I understand why they're needed in the first place, but I
> > don't understand why they need to live at a fixed address.
> 
> Just very briefly - and this will be my last reply on this until Monday,
> especially so as I've spent a solid 1h30 replying to the emails in this
> thread, and it's past midnight here.

Understood. Don't feel like you have to reply to the further emails in
this thread; we can wait until you have time.

> To answer your "why" (which I'll take as "why can't they be removed")
> the reason for that is the lack of TLS support.  If the kuser page
> is removed _and_ there is no hardware TLS support, then there is no way
> for userspace to have TLS.
> 
> As for your second bit (about why they need to live at a fixed address)
> we could have randomised it on v6 and later CPUs, but there are some
> v6 CPUs which suffer from data cache aliasing, just like all the previous
> CPUs.  With the data cache aliasing, it would make the TLS implementation
> a lot more complex.
> 
> As part of the TLS implementation for older CPUs, the TLS value is stored
> right at the top of that page, and on every context switch, we have to
> update that value.  If the page was at a randomised address, the kernel
> would either have to poke directly into userspace and flush it from the
> caches (which is error prone - what if userspace unmaps the page) or it
> has to do cache flushing to ensure that the value is visible via the
> user alias of the page.
> 
> If it wasn't for this need, we would have probably gone for a VDSO from
> the start.

Thank you for taking the time to give the technical explanation of why
it is the way it is. That makes a lot more sense, and indeed it makes
the option of having a randomized address seem a lot less attractive.

Rich

* [musl] Re: Thread pointer changes
  2014-06-27 23:20     ` Russell King - ARM Linux
@ 2014-06-28  0:38       ` Rich Felker
  0 siblings, 0 replies; 30+ messages in thread
From: Rich Felker @ 2014-06-28  0:38 UTC (permalink / raw)
  To: linux-arm-kernel

On Sat, Jun 28, 2014 at 12:20:57AM +0100, Russell King - ARM Linux wrote:
> I've shown that this isn't as big a problem as first thought, because
> there's ways that a libc can trivially detect the CPU that it is running
> on, and from that know what instructions are available to it.

I'd really rather be able to detect "Is kuser helper page available?"
than "Are the instructions to get by without it available?" The big
reason is that, when you want a binary that works "anywhere" on an ISA
like ARM where the vendor keeps deprecating instructions and replacing
them with new ones, it's safer to rely on the kernel to tell you which
instructions work than having hard-coded ones that might start failing
somewhere down the line. The other reason is that the pre-v6 kuser
compare-and-swap is actually faster on non-SMP machines since it
doesn't need any locking or barriers.

I do actually have a solution for this: using process_vm_readv, the
existence of the kuser helper page can be probed to determine if it's
safe to access, and the kuser version number can be read. But this is
somewhat hackish and might have other downsides.
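
For illustration, a minimal sketch of such a probe. It assumes the
version word lives at 0xffff0ffc, as the kernel's kuser helper
documentation states for 32-bit ARM; the names probe_read_u32 and
kuser_helper_version are made up for this example, and the raw
syscall is used only to keep the sketch independent of libc wrappers.
The point is that an unmapped address produces an error return
rather than SIGSEGV:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <sys/uio.h>
#include <unistd.h>

/* Address of __kuser_helper_version per the kernel's ARM kuser helper
 * documentation; only meaningful on 32-bit ARM. */
#define KUSER_HELPER_VERSION_ADDR ((uintptr_t)0xffff0ffcUL)

/* Hypothetical helper: copy one 32-bit word from addr in our own
 * address space via process_vm_readv.
 * Returns 1 if the word was read, 0 if the address is not readable,
 * and -1 if the probe itself is unavailable (e.g. ENOSYS or EPERM). */
static int probe_read_u32(uintptr_t addr, uint32_t *out)
{
    assert(out != NULL);

    struct iovec local = { .iov_base = out, .iov_len = sizeof *out };
    struct iovec remote = { .iov_base = (void *)addr, .iov_len = sizeof *out };
    long n = syscall(SYS_process_vm_readv, (long)getpid(),
                     &local, 1UL, &remote, 1UL, 0UL);

    if (n == (long)sizeof *out)
        return 1;
    if (n < 0 && errno != EFAULT)
        return -1;
    return 0;
}

/* Returns the kuser helper version, 0 if the page is not readable, or
 * -1 if the probe could not be performed. */
static int kuser_helper_version(void)
{
    uint32_t version = 0;
    int r = probe_read_u32(KUSER_HELPER_VERSION_ADDR, &version);
    return r == 1 ? (int)version : r;
}
```

A libc would run this once at startup and fall back to inline
instructions when the result is 0. The -1 case (syscall missing or
filtered) is one of the downsides: the probe then says nothing either
way.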

> I've indicated that the kuser helpers are always provided when there
> is no hardware TLS support, which corresponds with a minimum ARM
> architecture of version 6K, and v6k has the atomic instructions.

Thanks. I wasn't aware that !(hwcap & HWCAP_TLS) was a sufficient
condition to ensure that the kuser helper page is available. In theory
SYS_get_thread_area or trap-emulated TLS access could have been
required in this case but I'm glad to hear that they're not.
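
As a sketch of what that check looks like in practice, assuming the
ARM value of HWCAP_TLS (bit 15 in the kernel's <asm/hwcap.h>) and
using getauxval() to fetch AT_HWCAP; the helper names are made up for
this example:

```c
#include <sys/auxv.h>

/* Value of HWCAP_TLS in the ARM kernel headers (<asm/hwcap.h>); spelled
 * out here so the sketch also compiles on other architectures, where
 * the bit means something else and the answer is meaningless. */
#define ARM_HWCAP_TLS (1UL << 15)

/* Hypothetical helper: per the statement quoted above, a clear TLS bit
 * implies that the kuser helper page is present. Returns 1 in that
 * case and 0 when hardware TLS is advertised. */
static int kuser_page_guaranteed(unsigned long hwcap)
{
    return !(hwcap & ARM_HWCAP_TLS);
}

/* The same check against the running process's auxiliary vector. */
static int kuser_page_guaranteed_here(void)
{
    return kuser_page_guaranteed(getauxval(AT_HWCAP));
}
```

Note that this only answers the question in one direction: when the
TLS bit is set, the page may or may not be there.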

> I've said that we're not going to move kuser helpers to a randomised
> address, and given strong reasons why not.

Understood.

> I've indicated where the CPU architecture can be retrieved from, and
> used to determine the availability of other instructions.

This is another area where I feel there is some deficiency: deducing
from a cpu string the availability or non-availability of features
sounds fragile. I think it can be done efficiently via AT_PLATFORM,
and like you said SYS_uname is probably safe/reliable too, but I'd
much rather have something in the form of explicit capabilities rather
than having to infer them from a cpu model string.
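
To make the fragility concrete, here is roughly what the AT_PLATFORM
approach amounts to. This sketch assumes the platform strings look
like "v5tel", "v6l" or "v7l" (a 'v', the architecture number, then
suffix letters), which is what I have seen ARM kernels report; the
function names are made up for this example:

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <sys/auxv.h>

/* Hypothetical helper: extract the architecture number from an ARM
 * AT_PLATFORM string. Returns 0 when the string does not start with
 * 'v' followed by a digit. */
static int arm_arch_from_platform(const char *platform)
{
    assert(platform != NULL);

    if (platform[0] != 'v' || !isdigit((unsigned char)platform[1]))
        return 0;
    /* strtol stops at the first non-digit, e.g. the "tel" in "v5tel". */
    return (int)strtol(platform + 1, NULL, 10);
}

/* Look up AT_PLATFORM for the running process; 0 means "unknown". */
static int arm_arch_from_auxv(void)
{
    const char *platform = (const char *)getauxval(AT_PLATFORM);
    return platform ? arm_arch_from_platform(platform) : 0;
}
```

Everything here hinges on the string format staying the same, and it
still says nothing about optional features within one architecture
version.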

> I've indicated that the ELF HWCAPs can be used to further refine the
> available instruction information.

With the exception of knowing whether to use DMB for barriers.

Rich

* [musl] Re: Thread pointer changes
  2014-06-27 22:04       ` Russell King - ARM Linux
  2014-06-27 22:26         ` Rich Felker
@ 2014-06-28  7:09         ` u-igbb at aetey.se
  1 sibling, 0 replies; 30+ messages in thread
From: u-igbb at aetey.se @ 2014-06-28  7:09 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jun 27, 2014 at 11:04:00PM +0100, Russell King - ARM Linux wrote:
> The kuser page isn't going anywhere _except_ for one situation: where
> the system integrator has determined that they have built their
> userspace not to require the kuser helpers, and wish to have greater
> security and performance through doing so.

> If the userspace requires the kuser helpers, then the page *must* be
> provided by the kernel.
> 
> It is a system integrator/distro bug to provide a kernel with the kuser
> helpers disabled, but then to provide a libc which requires the page.
 ...
> The system integrator has a choice to make:
 [how to compile the kernel and the library]

I would like you to be aware that the assumed "system integrator/distro"
is not necessarily a single entity. Generally speaking it is different
parties who choose the kernel and the libc_s_ being used (note the plural).

Running statically linked binaries is not the only example of this.

Running applications from a global file system (which also holds the
shared libraries they use) is what we do all the time.

We want the users to be able to run our software on all and any distros
(and they do). The common denominator is the Linux ABI against the
kernel. The more "locally defined" this ABI happens to be, the less
usable/useful it is.

This does not invalidate your point - of course the hardware
owner/administrator _may_ choose to render third-party applications
unusable on that hardware unit, that's fine.

But is he/she the same party as the "system integrator"? Not necessarily.

Regards,
Rune

* [musl] Re: Thread pointer changes
  2014-06-27 23:12                 ` Russell King - ARM Linux
@ 2014-06-28 16:37                   ` Szabolcs Nagy
  0 siblings, 0 replies; 30+ messages in thread
From: Szabolcs Nagy @ 2014-06-28 16:37 UTC (permalink / raw)
  To: linux-arm-kernel

* Russell King - ARM Linux <linux@arm.linux.org.uk> [2014-06-28 00:12:24 +0100]:
> On Sat, Jun 28, 2014 at 12:40:17AM +0200, Szabolcs Nagy wrote:
> > but cp15 barrier is deprecated on armv7+
> 
> Can you _please_ read all my replies and stop cherry picking what you
> want?  There's many of you, and only one of me - please do me the
> effort of fully reading my replies before replying on points I've
> already included.

I have read it. We knew about uname, /proc and the TLS HWCAP flag,
and I thought it was clear that uname is inappropriate for determining
ISA features, because we don't know what naming scheme ARM will use in
the future.

So the situation is:

- dmb vs cp15 dispatch needs an extra uname syscall and fragile string
parsing at program startup

- there is no reasonable way to determine kuser page availability
on armv6k+ if PaX kernels need to be supported (asking the system
integrator or using process_vm_readv is not reasonable)

- there is no reasonable way to determine whether the system is SMP and
needs barriers (parsing /proc/cpuinfo is not reasonable)
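
For reference, the first point amounts to something like the sketch
below. It assumes uname reports machine strings of the form "armv7l"
or "armv5tel", and the mapping (dmb on v7 and later, the cp15 barrier
on v6, the kernel's kuser helper otherwise) is only one plausible
policy; the enum and function names are made up for this example:

```c
#include <assert.h>
#include <stdio.h>
#include <sys/utsname.h>

enum barrier_kind { BARRIER_KUSER, BARRIER_CP15, BARRIER_DMB };

/* Hypothetical helper: pull the architecture number out of a uname
 * machine string. Returns 0 when the string does not match "armv<N>". */
static int arm_arch_from_machine(const char *machine)
{
    assert(machine != NULL);

    int arch = 0;
    if (sscanf(machine, "armv%d", &arch) != 1)
        return 0;
    return arch;
}

/* Assumed policy: dmb on v7+, cp15 barrier on v6, and the kuser
 * helper for anything older or unrecognised. */
static enum barrier_kind pick_barrier(int arch)
{
    if (arch >= 7)
        return BARRIER_DMB;
    if (arch == 6)
        return BARRIER_CP15;
    return BARRIER_KUSER;
}

/* The extra syscall at program startup. */
static enum barrier_kind current_barrier(void)
{
    struct utsname u;
    if (uname(&u) != 0)
        return BARRIER_KUSER;
    return pick_barrier(arm_arch_from_machine(u.machine));
}
```

Any machine string that does not fit the "armv<N>" pattern silently
ends up on the fallback path, which is exactly the fragility in
question.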

* [musl] Re: Thread pointer changes
  2014-06-27 23:40                       ` Andy Lutomirski
@ 2014-06-30 15:38                         ` Christopher Covington
  2014-07-02 21:16                           ` Rich Felker
  0 siblings, 1 reply; 30+ messages in thread
From: Christopher Covington @ 2014-06-30 15:38 UTC (permalink / raw)
  To: linux-arm-kernel

On 06/27/2014 07:40 PM, Andy Lutomirski wrote:

> If you ever want an ARM vdso (e.g. for timing), I'd be happy to help
> and try to share code with x86.

http://www.spinics.net/lists/arm-kernel/msg340661.html

Christopher

-- 
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by the Linux Foundation.

* [musl] Re: Thread pointer changes
  2014-06-30 15:38                         ` Christopher Covington
@ 2014-07-02 21:16                           ` Rich Felker
  0 siblings, 0 replies; 30+ messages in thread
From: Rich Felker @ 2014-07-02 21:16 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Jun 30, 2014 at 11:38:10AM -0400, Christopher Covington wrote:
> On 06/27/2014 07:40 PM, Andy Lutomirski wrote:
> 
> > If you ever want an ARM vdso (e.g. for timing), I'd be happy to help
> > and try to share code with x86.
> 
> http://www.spinics.net/lists/arm-kernel/msg340661.html

Great! Please let us know when this makes it into the official kernel
(so we have some assurance the interface is stable) and I'll add
support in musl as soon as we can find a system to test it on. Unless
there's any gratuitous incompatibility with vdso on other archs,
adding it on our side should be just a three-line patch to
arch/arm/syscall_arch.h.

Rich
