linux-kernel.vger.kernel.org archive mirror
* [PATCH] arm64: Expose TASK_SIZE to userspace via auxv
@ 2016-08-16 18:32 Christopher Covington
  2016-08-17 10:30 ` Catalin Marinas
  0 siblings, 1 reply; 8+ messages in thread
From: Christopher Covington @ 2016-08-16 18:32 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon, Ard Biesheuvel, Mark Rutland,
	Suzuki K Poulose, James Hogan, Christopher Covington,
	linux-arm-kernel, linux-kernel
  Cc: Jon Masters, Jeremy Linton

Some userspace applications need to know the maximum virtual address they can
use (TASK_SIZE). There are several possible values for TASK_SIZE with the arm64
kernel, and such applications are either making bad hard-coded assumptions, or
are guessing and checking using system calls like munmap(), which may have
other reasons for returning an error than TASK_SIZE being exceeded. To make
correct functioning easy for userspace applications that need to know the
maximum virtual address they can use, communicate TASK_SIZE via the ELF
auxiliary vector, just like PAGE_SIZE is currently communicated.

Signed-off-by: Christopher Covington <cov@codeaurora.org>
---
Tested with the following commands:
LD_SHOW_AUXV=1 sleep 1 # GNU dynamic ld-linux*.so
hexdump -v -e '4/4 "%08x " "\n"' /proc/self/auxv | \
  sed -r 's/0*([^ ]+) ([^ ]+) ([^ ]+) ([^ ]+)/\1 0x\4\3/
    s/^0 /    NULL: /
    s/^3 /    PHDR: /
    s/^4 /   PHENT: /
    s/^5 /   PHNUM: /
    s/^6 /  PAGESZ: /
    s/^7 /    BASE: /
    s/^8 /   FLAGS: /
    s/^9 /   ENTRY: /
    s/^b /     UID: /
    s/^c /    EUID: /
    s/^d /     GID: /
    s/^e /    EGID: /
    s/^f /PLATFORM: /
    s/^10 /   HWCAP: /
    s/^11 /  CLKTCK: /
    s/^17 /  SECURE: /
    s/^19 /  RANDOM: /
    s/^1f /  EXECFN: /
    s/^21 /    VDSO: /
    s/^22 /  TASKSZ: /' # compatible with static busybox
---
 arch/arm64/include/asm/elf.h         | 1 +
 arch/arm64/include/uapi/asm/auxvec.h | 3 ++-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/elf.h b/arch/arm64/include/asm/elf.h
index a55384f..3811795 100644
--- a/arch/arm64/include/asm/elf.h
+++ b/arch/arm64/include/asm/elf.h
@@ -145,6 +145,7 @@ typedef struct user_fpsimd_state elf_fpregset_t;
 do {									\
 	NEW_AUX_ENT(AT_SYSINFO_EHDR,					\
 		    (elf_addr_t)current->mm->context.vdso);		\
+	NEW_AUX_ENT(AT_TASKSZ, TASK_SIZE);				\
 } while (0)
 
 #define ARCH_HAS_SETUP_ADDITIONAL_PAGES
diff --git a/arch/arm64/include/uapi/asm/auxvec.h b/arch/arm64/include/uapi/asm/auxvec.h
index 4cf0c17..595bfda 100644
--- a/arch/arm64/include/uapi/asm/auxvec.h
+++ b/arch/arm64/include/uapi/asm/auxvec.h
@@ -18,7 +18,8 @@
 
 /* vDSO location */
 #define AT_SYSINFO_EHDR	33
+#define AT_TASKSZ	34
 
-#define AT_VECTOR_SIZE_ARCH 1 /* entries in ARCH_DLINFO */
+#define AT_VECTOR_SIZE_ARCH 2 /* entries in ARCH_DLINFO */
 
 #endif
-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm
Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora
Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply related	[flat|nested] 8+ messages in thread
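
For illustration, a userspace consumer of the proposed entry might look like
the minimal sketch below. It assumes the AT_TASKSZ value of 34 from the patch
above, which is a proposal in this thread and not something to rely on from
installed uapi headers. The helper names auxv_lookup and task_size_from_auxv
are hypothetical. getauxval() returns 0 for a type the kernel did not supply,
so callers would still need a fallback on kernels without the patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/auxv.h>   /* getauxval(), provided by glibc */

/* Assumption: the value proposed in this patch, not a mainline constant. */
#ifndef AT_TASKSZ
#define AT_TASKSZ 34
#endif

/*
 * Scan an array of 64-bit (type, value) pairs, laid out as in
 * /proc/self/auxv on a 64-bit system. Stops at AT_NULL (type 0) or after
 * n_pairs entries. Returns 1 and stores the value if found, 0 otherwise.
 */
int auxv_lookup(const uint64_t *pairs, size_t n_pairs,
                uint64_t type, uint64_t *value)
{
    for (size_t i = 0; i < n_pairs; i++) {
        uint64_t t = pairs[2 * i];

        if (t == 0)             /* AT_NULL terminates the vector */
            break;
        if (t == type) {
            *value = pairs[2 * i + 1];
            return 1;
        }
    }
    return 0;
}

/* TASK_SIZE as reported by the kernel, or 0 if the entry is absent. */
unsigned long task_size_from_auxv(void)
{
    return getauxval(AT_TASKSZ);
}
```

auxv_lookup() covers the case handled by the hexdump pipeline in the test
notes (a static binary reading /proc/self/auxv), while getauxval() is the
shorter route when glibc is available.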

* Re: [PATCH] arm64: Expose TASK_SIZE to userspace via auxv
  2016-08-16 18:32 [PATCH] arm64: Expose TASK_SIZE to userspace via auxv Christopher Covington
@ 2016-08-17 10:30 ` Catalin Marinas
  2016-08-17 11:12   ` Christopher Covington
  0 siblings, 1 reply; 8+ messages in thread
From: Catalin Marinas @ 2016-08-17 10:30 UTC (permalink / raw)
  To: Christopher Covington
  Cc: Will Deacon, Ard Biesheuvel, Mark Rutland, Suzuki K Poulose,
	James Hogan, linux-arm-kernel, linux-kernel, Jon Masters,
	Jeremy Linton

On Tue, Aug 16, 2016 at 02:32:29PM -0400, Christopher Covington wrote:
> Some userspace applications need to know the maximum virtual address they can
> use (TASK_SIZE).

Just curious, what are the cases needing TASK_SIZE in user space?

-- 
Catalin

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] arm64: Expose TASK_SIZE to userspace via auxv
  2016-08-17 10:30 ` Catalin Marinas
@ 2016-08-17 11:12   ` Christopher Covington
  2016-08-18 12:00     ` Ard Biesheuvel
  2016-08-18 12:17     ` Richard Weinberger
  0 siblings, 2 replies; 8+ messages in thread
From: Christopher Covington @ 2016-08-17 11:12 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Will Deacon, Ard Biesheuvel, Mark Rutland, Suzuki K Poulose,
	James Hogan, linux-arm-kernel, linux-kernel, Jon Masters,
	Jeremy Linton



On August 17, 2016 6:30:06 AM EDT, Catalin Marinas <catalin.marinas@arm.com> wrote:
>On Tue, Aug 16, 2016 at 02:32:29PM -0400, Christopher Covington wrote:
>> Some userspace applications need to know the maximum virtual address they can
>> use (TASK_SIZE).
>
>Just curious, what are the cases needing TASK_SIZE in user space?

Checkpoint/Restore In Userspace and the Mozilla Javascript Engine https://bugzilla.mozilla.org/show_bug.cgi?id=1143022 are the specific cases I've run into. I've heard LuaJIT might have a similar situation. In general I think making allocations from the top down is a shortcut for finding a large unused region of memory.

Thanks,
Cov

-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.

Sent from my Snapdragon powered Android device with K-9 Mail. Please excuse my brevity.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] arm64: Expose TASK_SIZE to userspace via auxv
  2016-08-17 11:12   ` Christopher Covington
@ 2016-08-18 12:00     ` Ard Biesheuvel
  2016-08-18 12:42       ` Catalin Marinas
  2016-08-18 12:17     ` Richard Weinberger
  1 sibling, 1 reply; 8+ messages in thread
From: Ard Biesheuvel @ 2016-08-18 12:00 UTC (permalink / raw)
  To: Christopher Covington
  Cc: Catalin Marinas, Will Deacon, Mark Rutland, Suzuki K Poulose,
	James Hogan, linux-arm-kernel, linux-kernel, Jon Masters,
	Jeremy Linton

On 17 August 2016 at 13:12, Christopher Covington <cov@codeaurora.org> wrote:
>
>
> On August 17, 2016 6:30:06 AM EDT, Catalin Marinas <catalin.marinas@arm.com> wrote:
>>On Tue, Aug 16, 2016 at 02:32:29PM -0400, Christopher Covington wrote:
>>> Some userspace applications need to know the maximum virtual address they can
>>> use (TASK_SIZE).
>>
>>Just curious, what are the cases needing TASK_SIZE in user space?
>
> Checkpoint/Restore In Userspace and the Mozilla Javascript Engine https://bugzilla.mozilla.org/show_bug.cgi?id=1143022 are the specific cases I've run into. I've heard LuaJIT might have a similar situation. In general I think making allocations from the top down is a shortcut for finding a large unused region of memory.
>

One aspect of this that I would like to discuss is whether the current
practice makes sense, of tying TASK_SIZE to whatever the size of the
kernel VA space is.

I could imagine simply limiting the user VA space to 39-bits (or even
36-bits, depending on how deeply we care about 16 KB pages), and
implement an arch specific hook (prctl() perhaps?) to increase
TASK_SIZE on demand. That would not only give us a reliable way to
check whether this is supported (i.e., the prctl() would return error
if it isn't), it also allows for some optimizations, since a 48-bit VA
kernel can run all processes using 3 levels with relative ease (and
switching between 4levels and 3levels processes would also be
possible, but would either require a TLB flush, or would result in
this optimization being disabled globally, whichever is less costly in
terms of performance)

-- 
Ard.

^ permalink raw reply	[flat|nested] 8+ messages in thread
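
The relationship between VA size and translation levels referred to above
can be worked through with a small sketch. It assumes 8-byte descriptors and
one page per table, so each level resolves (page_shift - 3) bits; the
function names are hypothetical and the code is not taken from the kernel
sources.

```c
/*
 * Number of translation levels needed to cover va_bits of virtual
 * address space with pages of 2^page_shift bytes. Each level resolves
 * (page_shift - 3) bits because a table is one page of 8-byte entries;
 * the top level may be only partially used, hence the rounding up.
 */
unsigned int pgtable_levels(unsigned int va_bits, unsigned int page_shift)
{
    unsigned int bits_per_level = page_shift - 3;
    unsigned int to_resolve = va_bits - page_shift;

    return (to_resolve + bits_per_level - 1) / bits_per_level;
}

/* TCR_EL1.T0SZ encodes the TTBR0 region size as 2^(64 - T0SZ) bytes. */
unsigned int t0sz_for(unsigned int va_bits)
{
    return 64 - va_bits;
}
```

With 4 KB pages (page_shift 12) this gives 3 levels for 39-bit and 4 levels
for 48-bit VA. With 16 KB pages (page_shift 14), 36-bit VA needs 2 levels
and 47-bit VA needs 3. Moving a task from 39-bit to 48-bit VA changes T0SZ
from 25 to 16.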

* Re: [PATCH] arm64: Expose TASK_SIZE to userspace via auxv
  2016-08-17 11:12   ` Christopher Covington
  2016-08-18 12:00     ` Ard Biesheuvel
@ 2016-08-18 12:17     ` Richard Weinberger
  2016-09-09 14:14       ` Christopher Covington
  1 sibling, 1 reply; 8+ messages in thread
From: Richard Weinberger @ 2016-08-18 12:17 UTC (permalink / raw)
  To: Christopher Covington
  Cc: Catalin Marinas, Will Deacon, Ard Biesheuvel, Mark Rutland,
	Suzuki K Poulose, James Hogan, linux-arm-kernel, LKML,
	Jon Masters, Jeremy Linton

On Wed, Aug 17, 2016 at 1:12 PM, Christopher Covington
<cov@codeaurora.org> wrote:
>
>
> On August 17, 2016 6:30:06 AM EDT, Catalin Marinas <catalin.marinas@arm.com> wrote:
>>On Tue, Aug 16, 2016 at 02:32:29PM -0400, Christopher Covington wrote:
>>> Some userspace applications need to know the maximum virtual address they can
>>> use (TASK_SIZE).
>>
>>Just curious, what are the cases needing TASK_SIZE in user space?
>
> Checkpoint/Restore In Userspace and the Mozilla Javascript Engine https://bugzilla.mozilla.org/show_bug.cgi?id=1143022 are the specific cases I've run into. I've heard LuaJIT might have a similar situation. In general I think making allocations from the top down is a shortcut for finding a large unused region of memory.

I think this makes sense for all archs.
At least UserModeLinux on x86 also needs to know the bottom and top
addresses of the usable address space.
Currently it figures them out by scanning and catching SIGSEGV.

-- 
Thanks,
//richard

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] arm64: Expose TASK_SIZE to userspace via auxv
  2016-08-18 12:00     ` Ard Biesheuvel
@ 2016-08-18 12:42       ` Catalin Marinas
  2016-08-18 13:18         ` Ard Biesheuvel
  0 siblings, 1 reply; 8+ messages in thread
From: Catalin Marinas @ 2016-08-18 12:42 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Christopher Covington, Mark Rutland, James Hogan,
	Suzuki K Poulose, Will Deacon, linux-kernel, Jeremy Linton,
	Jon Masters, linux-arm-kernel

On Thu, Aug 18, 2016 at 02:00:56PM +0200, Ard Biesheuvel wrote:
> On 17 August 2016 at 13:12, Christopher Covington <cov@codeaurora.org> wrote:
> > On August 17, 2016 6:30:06 AM EDT, Catalin Marinas <catalin.marinas@arm.com> wrote:
> >>On Tue, Aug 16, 2016 at 02:32:29PM -0400, Christopher Covington wrote:
> >>> Some userspace applications need to know the maximum virtual address they can
> >>> use (TASK_SIZE).
> >>
> >>Just curious, what are the cases needing TASK_SIZE in user space?
> >
> > Checkpoint/Restore In Userspace and the Mozilla Javascript Engine
> > https://bugzilla.mozilla.org/show_bug.cgi?id=1143022 are the
> > specific cases I've run into. I've heard LuaJIT might have a similar
> > situation. In general I think making allocations from the top down
> > is a shortcut for finding a large unused region of memory.
> 
> One aspect of this that I would like to discuss is whether the current
> practice makes sense, of tying TASK_SIZE to whatever the size of the
> kernel VA space is.

I'm fine with decoupling them as long as we can have sane
pgd/pud/pmd/pte macros. We rely on generic files like pgtable-nopud.h
etc. currently, so we would have to give up on that and do our own
checks. It's also worth testing any potential performance implication of
creating/tearing down large page tables with the new macros.

> I could imagine simply limiting the user VA space to 39-bits (or even
> 36-bits, depending on how deeply we care about 16 KB pages), and
> implement an arch specific hook (prctl() perhaps?) to increase
> TASK_SIZE on demand.

As you stated below, switching TASK_SIZE on demand is problematic if you
actually want to switch TCR_EL1.T0SZ. As per other recent
discussions, I'm not sure we can do it safely without full TLBI on
context switch. That's an aspect we'll have to sort out with 52-bit VA
but most likely we'll allow this range in T0SZ and only artificially
limit TASK_SIZE to smaller values so that we don't break any other
tasks. But then you won't gain much from a reduced number of page table
levels.

> That would not only give us a reliable way to check whether this is
> supported (i.e., the prctl() would return error if it isn't), it also
> allows for some optimizations, since a 48-bit VA kernel can run all
> processes using 3 levels with relative ease (and switching between
> 4levels and 3levels processes would also be possible, but would either
> require a TLB flush, or would result in this optimization being
> disabled globally, whichever is less costly in terms of performance)

I'm more for using 48-bit VA permanently for both user and kernel (and
52-bit VA at some point in the future, though limiting user space to
48-bit VA by default). But it would be good to get some benchmark
numbers on the impact to see whether it's still worth keeping the other
VA combinations around.

-- 
Catalin

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] arm64: Expose TASK_SIZE to userspace via auxv
  2016-08-18 12:42       ` Catalin Marinas
@ 2016-08-18 13:18         ` Ard Biesheuvel
  0 siblings, 0 replies; 8+ messages in thread
From: Ard Biesheuvel @ 2016-08-18 13:18 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Christopher Covington, Mark Rutland, James Hogan,
	Suzuki K Poulose, Will Deacon, linux-kernel, Jeremy Linton,
	Jon Masters, linux-arm-kernel

On 18 August 2016 at 14:42, Catalin Marinas <catalin.marinas@arm.com> wrote:
> On Thu, Aug 18, 2016 at 02:00:56PM +0200, Ard Biesheuvel wrote:
>> On 17 August 2016 at 13:12, Christopher Covington <cov@codeaurora.org> wrote:
>> > On August 17, 2016 6:30:06 AM EDT, Catalin Marinas <catalin.marinas@arm.com> wrote:
>> >>On Tue, Aug 16, 2016 at 02:32:29PM -0400, Christopher Covington wrote:
>> >>> Some userspace applications need to know the maximum virtual address they can
>> >>> use (TASK_SIZE).
>> >>
>> >>Just curious, what are the cases needing TASK_SIZE in user space?
>> >
>> > Checkpoint/Restore In Userspace and the Mozilla Javascript Engine
>> > https://bugzilla.mozilla.org/show_bug.cgi?id=1143022 are the
>> > specific cases I've run into. I've heard LuaJIT might have a similar
>> > situation. In general I think making allocations from the top down
>> > is a shortcut for finding a large unused region of memory.
>>
>> One aspect of this that I would like to discuss is whether the current
>> practice makes sense, of tying TASK_SIZE to whatever the size of the
>> kernel VA space is.
>
> I'm fine with decoupling them as long as we can have sane
> pgd/pud/pmd/pte macros. We rely on generic files like pgtable-nopud.h
> etc. currently, so we would have to give up on that and do our own
> checks. It's also worth testing any potential performance implication of
> creating/tearing down large page tables with the new macros.
>

Well, I don't think it is necessarily worth the trouble of rewriting
all that. My concern is that TASK_SIZE randomly increased to 48 bits
recently, merely because some Freescale SoCs cannot fit their RAM into
the linear mapping on a 39-bit VA kernel. This had nothing to do with
userland requirements. Do we know the userland requirements? What use
cases do we know about that require >39 bit userland VA space?

>> I could imagine simply limiting the user VA space to 39-bits (or even
>> 36-bits, depending on how deeply we care about 16 KB pages), and
>> implement an arch specific hook (prctl() perhaps?) to increase
>> TASK_SIZE on demand.
>
> As you stated below, switching TASK_SIZE on demand is problematic if you
> actually want to switch TCR_EL1.T0SZ. As per other recent
> discussions, I'm not sure we can do it safely without full TLBI on
> context switch. That's an aspect we'll have to sort out with 52-bit VA
> but most likely we'll allow this range in T0SZ and only artificially
> limit TASK_SIZE to smaller values so that we don't break any other
> tasks. But then you won't gain much from a reduced number of page table
> levels.
>

There are several ways to go about this. The 48-bit VA kernel could
run everything with 3 levels, and simply switch to 4 levels the moment
some process needs it. So we keep all the existing macros, but simply
point TTBR0_EL1 to the level 1 translation table rather than to the
level 0 table (and update T0SZ accordingly). So when the first 48 bit
VA userland process arrives (which may be never in many cases), we
either switch to 4 levels for everything (and the page tables are
already set up for that), or we do a TLB flush, but only when
switching from a 4levels task to a 3levels task or vice versa (but
this is messy so the first approach is probably more suitable)

So there is no associated space savings, only the TLB and cache
footprint gets optimized.

>> That would not only give us a reliable way to check whether this is
>> supported (i.e., the prctl() would return error if it isn't), it also
>> allows for some optimizations, since a 48-bit VA kernel can run all
>> processes using 3 levels with relative ease (and switching between
>> 4levels and 3levels processes would also be possible, but would either
>> require a TLB flush, or would result in this optimization being
>> disabled globally, whichever is less costly in terms of performance)
>
> I'm more for using 48-bit VA permanently for both user and kernel (and
> 52-bit VA at some point in the future, though limiting user space to
> 48-bit VA by default). But it would be good to get some benchmark
> numbers on the impact to see whether it's still worth keeping the other
> VA combinations around.
>

Of course, none of this complexity is justified if the performance
impact is negligible. I do wonder about the virt case, though.

-- 
Ard.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] arm64: Expose TASK_SIZE to userspace via auxv
  2016-08-18 12:17     ` Richard Weinberger
@ 2016-09-09 14:14       ` Christopher Covington
  0 siblings, 0 replies; 8+ messages in thread
From: Christopher Covington @ 2016-09-09 14:14 UTC (permalink / raw)
  To: Richard Weinberger
  Cc: Catalin Marinas, Will Deacon, Ard Biesheuvel, Mark Rutland,
	Suzuki K Poulose, James Hogan, linux-arm-kernel, LKML,
	Jon Masters, Jeremy Linton

Hi Richard,

On 08/18/2016 08:17 AM, Richard Weinberger wrote:
> On Wed, Aug 17, 2016 at 1:12 PM, Christopher Covington
> <cov@codeaurora.org> wrote:
>>
>>
>> On August 17, 2016 6:30:06 AM EDT, Catalin Marinas <catalin.marinas@arm.com> wrote:
>>> On Tue, Aug 16, 2016 at 02:32:29PM -0400, Christopher Covington wrote:
>>>> Some userspace applications need to know the maximum virtual address they can
>>>> use (TASK_SIZE).
>>>
>>> Just curious, what are the cases needing TASK_SIZE in user space?
>>
>> Checkpoint/Restore In Userspace and the Mozilla Javascript Engine https://bugzilla.mozilla.org/show_bug.cgi?id=1143022 are the specific cases I've run into. I've heard LuaJIT might have a similar situation. In general I think making allocations from the top down is a shortcut for finding a large unused region of memory.
> 
> I think this makes sense for all archs.
> At least UserModeLinux on x86 also needs to know the bottom and top
> addresses of the usable address space.
> Currently it figures them out by scanning and catching SIGSEGV.

For the bottom, can you use /proc/sys/vm/mmap_min_addr?

Cov

-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm
Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code
Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 8+ messages in thread
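
Reading the lower bound suggested above might look like the following
minimal sketch. The helper names parse_min_addr and read_mmap_min_addr are
hypothetical. Note that /proc/sys/vm/mmap_min_addr is an administrator-set
policy limit for mmap(), so it may not match the lowest address the
architecture itself permits.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Parse the decimal text found in /proc/sys/vm/mmap_min_addr.
 * Returns 0 on success, -1 if the text does not start with a number.
 */
int parse_min_addr(const char *text, unsigned long *out)
{
    char *end;
    unsigned long v;

    errno = 0;
    v = strtoul(text, &end, 10);
    if (errno != 0 || end == text)
        return -1;
    *out = v;
    return 0;
}

/* Read the sysctl; returns 0 on success, -1 if it cannot be read. */
int read_mmap_min_addr(unsigned long *out)
{
    char buf[32];
    FILE *f = fopen("/proc/sys/vm/mmap_min_addr", "r");

    if (!f)
        return -1;
    if (!fgets(buf, sizeof(buf), f)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    return parse_min_addr(buf, out);
}
```

Splitting the parsing from the file access keeps the fallback path simple: if
the file cannot be read, the caller can fall back to probing.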

end of thread, other threads:[~2016-09-09 14:14 UTC | newest]

Thread overview: 8+ messages
2016-08-16 18:32 [PATCH] arm64: Expose TASK_SIZE to userspace via auxv Christopher Covington
2016-08-17 10:30 ` Catalin Marinas
2016-08-17 11:12   ` Christopher Covington
2016-08-18 12:00     ` Ard Biesheuvel
2016-08-18 12:42       ` Catalin Marinas
2016-08-18 13:18         ` Ard Biesheuvel
2016-08-18 12:17     ` Richard Weinberger
2016-09-09 14:14       ` Christopher Covington
