linux-kernel.vger.kernel.org archive mirror
* Question on the five-level page table support patches
@ 2017-04-23 10:53 John Paul Adrian Glaubitz
  2017-04-24  5:11 ` Andy Lutomirski
  2017-04-24 16:19 ` Kirill A. Shutemov
  0 siblings, 2 replies; 9+ messages in thread
From: John Paul Adrian Glaubitz @ 2017-04-23 10:53 UTC (permalink / raw)
  To: Kirill A. Shutemov, linux-kernel
  Cc: Andi Kleen, Dave Hansen, Andy Lutomirski, Michal Hocko,
	linux-arch, linux-mm

Hi Kirill!

I recently read the LWN article on your and your colleagues' work to
add five-level page table support for x86 to the Linux kernel [1]
and I got your email address from the last patch of the series.

Since this extends the address space beyond 48 bits, as you may know,
it will cause potential headaches with JavaScript engines which use
tagged pointers. On SPARC, the virtual address space already extends
to 52 bits and we are running into these very issues with JavaScript
engines on SPARC.
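
To illustrate the concern, here is a minimal sketch of high-bit pointer
tagging. It assumes a hypothetical layout with a 16-bit tag stored above a
48-bit address; the names (box_ptr, unbox_ptr, ADDR_BITS) are made up for
illustration and are not taken from any particular engine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: low 48 bits hold the address, high 16 bits a tag. */
#define ADDR_BITS 48
#define ADDR_MASK ((UINT64_C(1) << ADDR_BITS) - 1)

/* Pack an address and a tag into one 64-bit word. */
uint64_t box_ptr(uint64_t addr, uint16_t tag)
{
    return ((uint64_t)tag << ADDR_BITS) | (addr & ADDR_MASK);
}

/* Recover the address; any address bits at or above ADDR_BITS are lost. */
uint64_t unbox_ptr(uint64_t boxed)
{
    return boxed & ADDR_MASK;
}

/* Recover the tag from the high bits. */
uint16_t unbox_tag(uint64_t boxed)
{
    return (uint16_t)(boxed >> ADDR_BITS);
}

/* True if addr survives a box/unbox round trip unchanged. */
bool round_trips(uint64_t addr, uint16_t tag)
{
    return unbox_ptr(box_ptr(addr, tag)) == addr;
}
```

An address that fits in 48 bits round-trips, while one with, say, bit 50
set does not, which is the failure mode on a 52-bit address space.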

Now, a possible way to mitigate this problem would be to pass the
"hint" parameter to mmap() in order to tell the kernel not to allocate
memory beyond the 48-bit address space. Unfortunately, on Linux this
will only work when the area pointed to by "hint" is unallocated which
means one cannot simply use a hardcoded "hint" to mitigate this problem.

However, since this trick still works on NetBSD and used to work on
Linux [3], I was wondering whether there are plans to bring back
this behavior to mmap() in Linux.

Currently, people are using ugly work-arounds [4] to address this
problem which involve a manual iteration over memory blocks and
basically implementing another allocator in the user space
application.

Thanks,
Adrian

> [1] https://lwn.net/Articles/717293/
> [2] https://lwn.net/Articles/717300/
> [3] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=824449#22
> [4] https://hg.mozilla.org/mozilla-central/rev/dfaafbaaa291

-- 
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer - glaubitz@debian.org
`. `'   Freie Universitaet Berlin - glaubitz@physik.fu-berlin.de
  `-    GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913


* Re: Question on the five-level page table support patches
  2017-04-23 10:53 Question on the five-level page table support patches John Paul Adrian Glaubitz
@ 2017-04-24  5:11 ` Andy Lutomirski
  2017-04-24 13:03   ` Andi Kleen
  2017-04-24 16:19 ` Kirill A. Shutemov
  1 sibling, 1 reply; 9+ messages in thread
From: Andy Lutomirski @ 2017-04-24  5:11 UTC (permalink / raw)
  To: John Paul Adrian Glaubitz
  Cc: Kirill A. Shutemov, linux-kernel, Andi Kleen, Dave Hansen,
	Michal Hocko, linux-arch, linux-mm

On Sun, Apr 23, 2017 at 3:53 AM, John Paul Adrian Glaubitz
<glaubitz@physik.fu-berlin.de> wrote:
> Hi Kirill!
>
> I recently read the LWN article on your and your colleagues' work to
> add five-level page table support for x86 to the Linux kernel [1]
> and I got your email address from the last patch of the series.
>
> Since this extends the address space beyond 48 bits, as you may know,
> it will cause potential headaches with JavaScript engines which use
> tagged pointers. On SPARC, the virtual address space already extends
> to 52 bits and we are running into these very issues with JavaScript
> engines on SPARC.
>
> Now, a possible way to mitigate this problem would be to pass the
> "hint" parameter to mmap() in order to tell the kernel not to allocate
> memory beyond the 48-bit address space. Unfortunately, on Linux this
> will only work when the area pointed to by "hint" is unallocated which
> means one cannot simply use a hardcoded "hint" to mitigate this problem.
>
> However, since this trick still works on NetBSD and used to work on
> Linux [3], I was wondering whether there are plans to bring back
> this behavior to mmap() in Linux.
>
> Currently, people are using ugly work-arounds [4] to address this
> problem which involve a manual iteration over memory blocks and
> basically implementing another allocator in the user space
> application.
>
> Thanks,
> Adrian
>
>> [1] https://lwn.net/Articles/717293/
>> [2] https://lwn.net/Articles/717300/
>> [3] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=824449#22
>> [4] https://hg.mozilla.org/mozilla-central/rev/dfaafbaaa291
>

Can you explain what the issue is?  What used to work on Linux and
doesn't any more?  The man page is quite clear:

       MAP_FIXED
              Don't  interpret  addr  as  a hint: place the mapping at exactly
              that address.  addr must be a multiple of the page size.  If the
              memory  region  specified  by addr and len overlaps pages of any
              existing mapping(s), then the overlapped part  of  the  existing
              mapping(s)  will  be discarded.  If the specified address cannot
              be used, mmap() will fail.  Because requiring  a  fixed  address
              for  a  mapping is less portable, the use of this option is dis‐
              couraged.

and AFAIK Linux works exactly as documented.

FWIW, a patch to add a new MAP_ mode to tell mmap(2) to use the hinted
address if available and to *fail* if the hinted address is not
available would very likely be accepted and would IMO be much nicer
than the current behavior.
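
Until such a mode exists, the "use the hint or fail" semantics can be
approximated in user space. The following is only a sketch under stated
assumptions: mmap_at_or_fail is a hypothetical helper name, the mapping is
anonymous and private, and hint is non-NULL and page-aligned.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS under strict -std= flags */
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: map len bytes at exactly hint, or fail.
 * Unlike MAP_FIXED, this never discards an existing mapping.
 * Returns NULL on failure.
 */
void *mmap_at_or_fail(void *hint, size_t len)
{
    assert(hint != NULL);

    void *p = mmap(hint, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    if (p != hint) {
        /* The kernel placed the mapping elsewhere; undo and report failure. */
        munmap(p, len);
        errno = EEXIST;
        return NULL;
    }
    return p;
}
```

The cost of this emulation is an extra map/unmap pair whenever the hinted
range is occupied, which a kernel-side flag would avoid.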

--Andy


* Re: Question on the five-level page table support patches
  2017-04-24  5:11 ` Andy Lutomirski
@ 2017-04-24 13:03   ` Andi Kleen
  2017-04-24 20:33     ` John Paul Adrian Glaubitz
  0 siblings, 1 reply; 9+ messages in thread
From: Andi Kleen @ 2017-04-24 13:03 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: John Paul Adrian Glaubitz, Kirill A. Shutemov, linux-kernel,
	Dave Hansen, Michal Hocko, linux-arch, linux-mm

> Can you explain what the issue is?  What used to work on Linux and
> doesn't any more?  The man page is quite clear:

In old Linux the hint was a search hint: if there wasn't a hole at the
hinted address, the kernel would search for a hole starting from there
instead of giving up immediately.

Now it just gives up, which means every user has to implement
their own search.
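
A user-space search of that kind might look roughly like the sketch below.
It is not the code from any existing project: mmap_search_below and its
parameters are hypothetical, the mapping is anonymous and private, and the
stride is left to the caller.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS under strict -std= flags */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: find len bytes that end at or below limit by trying
 * successive hints from start upward in increments of step.
 * Returns NULL if nothing suitable was found.
 */
void *mmap_search_below(uintptr_t start, uintptr_t limit,
                        size_t len, size_t step)
{
    assert(step > 0 && len > 0);

    for (uintptr_t addr = start;
         addr < limit && limit - addr >= len; addr += step) {
        void *p = mmap((void *)addr, len, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED)
            return NULL;            /* out of memory or address space */
        if ((uintptr_t)p + len <= limit)
            return p;               /* hint honored, or kernel's pick fits */
        munmap(p, len);             /* placed too high: discard and retry */
        if (step > limit - addr)
            break;                  /* next hint would pass the limit */
    }
    return NULL;
}
```

Each failed probe costs two system calls, which is part of why this kind
of loop is unattractive compared to a search done inside the kernel.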

Yes, I ran into the same problem and it's annoying. I believe it
originally broke when top-down mmap was added.

Before the augmented rbtree such a search was potentially very expensive,
but now it should be cheap.

-Andi


* Re: Question on the five-level page table support patches
  2017-04-23 10:53 Question on the five-level page table support patches John Paul Adrian Glaubitz
  2017-04-24  5:11 ` Andy Lutomirski
@ 2017-04-24 16:19 ` Kirill A. Shutemov
  2017-04-24 20:37   ` John Paul Adrian Glaubitz
  1 sibling, 1 reply; 9+ messages in thread
From: Kirill A. Shutemov @ 2017-04-24 16:19 UTC (permalink / raw)
  To: John Paul Adrian Glaubitz
  Cc: Kirill A. Shutemov, linux-kernel, Andi Kleen, Dave Hansen,
	Andy Lutomirski, Michal Hocko, linux-arch, linux-mm

On Sun, Apr 23, 2017 at 12:53:46PM +0200, John Paul Adrian Glaubitz wrote:
> Hi Kirill!
> 
> I recently read the LWN article on your and your colleagues' work to
> add five-level page table support for x86 to the Linux kernel [1]
> and I got your email address from the last patch of the series.
> 
> Since this extends the address space beyond 48 bits, as you may know,
> it will cause potential headaches with JavaScript engines which use
> tagged pointers. On SPARC, the virtual address space already extends
> to 52 bits and we are running into these very issues with JavaScript
> engines on SPARC.
> 
> Now, a possible way to mitigate this problem would be to pass the
> "hint" parameter to mmap() in order to tell the kernel not to allocate
> memory beyond the 48-bit address space. Unfortunately, on Linux this
> will only work when the area pointed to by "hint" is unallocated which
> means one cannot simply use a hardcoded "hint" to mitigate this problem.

In the proposed implementation, we also use the hint address, but in a
different way: by default, if the hint address is NULL, the kernel will not
create mappings above 47 bits, preserving compatibility.

If an application wants access to the larger address space, it has to
specify a hint address above 47 bits.

See details here:

http://lkml.kernel.org/r/20170420162147.86517-10-kirill.shutemov@linux.intel.com
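
From an application's point of view, the scheme described above could be
used roughly as follows. This is a sketch that assumes a 64-bit Linux
system; map_compat, map_wide and WIDE_HINT are hypothetical names, and the
hint value is an arbitrary address above the 47-bit boundary.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS under strict -std= flags */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Arbitrary hint above the 47-bit boundary (assumes 64-bit pointers). */
#define WIDE_HINT ((void *)(uintptr_t)(UINT64_C(1) << 48))

/*
 * Default case: NULL hint. Under the proposal the kernel keeps the
 * mapping below 47 bits, so pointer-tagging code keeps working.
 */
void *map_compat(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/*
 * Opt-in case: a hint above 47 bits signals that the caller can handle
 * addresses from the larger address space. Without MAP_FIXED the hint
 * is only advisory, so the result may still be a low address.
 */
void *map_wide(size_t len)
{
    void *p = mmap(WIDE_HINT, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

Callers of map_wide must not assume anything about the high bits of the
returned pointer, while callers of map_compat get the old guarantee.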

-- 
 Kirill A. Shutemov


* Re: Question on the five-level page table support patches
  2017-04-24 13:03   ` Andi Kleen
@ 2017-04-24 20:33     ` John Paul Adrian Glaubitz
  0 siblings, 0 replies; 9+ messages in thread
From: John Paul Adrian Glaubitz @ 2017-04-24 20:33 UTC (permalink / raw)
  To: Andi Kleen, Andy Lutomirski
  Cc: Kirill A. Shutemov, linux-kernel, Dave Hansen, Michal Hocko,
	linux-arch, linux-mm

On 04/24/2017 03:03 PM, Andi Kleen wrote:
> In old Linux the hint was a search hint: if there wasn't a hole at the
> hinted address, the kernel would search for a hole starting from there
> instead of giving up immediately.

Yep, that's what I meant. It used to work like that and it still
works like that on NetBSD, for example. Although it has apparently
been a long time since it changed [1].

> Now it just gives up, which means every user has to implement
> their own search.

Correct. And the resulting code is usually ugly and inefficient [2].

> Yes, I ran into the same problem and it's annoying. I believe it
> originally broke when top-down mmap was added.
> 
> Before the augmented rbtree such a search was potentially very expensive,
> but now it should be cheap.

I'm not sure whether I understand what that means.

Thanks,
Adrian

> [1] http://lkml.iu.edu/hypermail/linux/kernel/0305.2/0828.html
> [2] https://hg.mozilla.org/mozilla-central/rev/dfaafbaaa291

-- 
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer - glaubitz@debian.org
`. `'   Freie Universitaet Berlin - glaubitz@physik.fu-berlin.de
  `-    GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913


* Re: Question on the five-level page table support patches
  2017-04-24 16:19 ` Kirill A. Shutemov
@ 2017-04-24 20:37   ` John Paul Adrian Glaubitz
  2017-04-24 22:01     ` Kirill A. Shutemov
  2017-04-24 22:09     ` David Miller
  0 siblings, 2 replies; 9+ messages in thread
From: John Paul Adrian Glaubitz @ 2017-04-24 20:37 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Kirill A. Shutemov, linux-kernel, Andi Kleen, Dave Hansen,
	Andy Lutomirski, Michal Hocko, linux-arch, linux-mm

On 04/24/2017 06:19 PM, Kirill A. Shutemov wrote:
> In the proposed implementation, we also use the hint address, but in a
> different way: by default, if the hint address is NULL, the kernel will not
> create mappings above 47 bits, preserving compatibility.

Ooooh, that would solve a lot of problems actually if it were to be available
on all architectures. On SPARC, the situation is really annoying and I have
been discussing a solution with the Qt developers and they suggested a
similar approach, just one that would also apply to brk() [1].

> If an application wants access to the larger address space, it has to
> specify a hint address above 47 bits.
> 
> See details here:
> 
> http://lkml.kernel.org/r/20170420162147.86517-10-kirill.shutemov@linux.intel.com

Thanks, I'll have a read. From your message, though, I gather that this
particular proposal got rejected.

Would be really nice to be able to have a canonical solution for this issue,
it's been biting us on SPARC for quite a while now due to the fact that
virtual address space has been 52 bits on SPARC for a while now.

Adrian

> [1] https://bugreports.qt.io/browse/QTBUG-56264

-- 
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer - glaubitz@debian.org
`. `'   Freie Universitaet Berlin - glaubitz@physik.fu-berlin.de
  `-    GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913


* Re: Question on the five-level page table support patches
  2017-04-24 20:37   ` John Paul Adrian Glaubitz
@ 2017-04-24 22:01     ` Kirill A. Shutemov
  2017-04-24 22:09     ` David Miller
  1 sibling, 0 replies; 9+ messages in thread
From: Kirill A. Shutemov @ 2017-04-24 22:01 UTC (permalink / raw)
  To: John Paul Adrian Glaubitz
  Cc: Kirill A. Shutemov, linux-kernel, Andi Kleen, Dave Hansen,
	Andy Lutomirski, Michal Hocko, linux-arch, linux-mm

On Mon, Apr 24, 2017 at 10:37:40PM +0200, John Paul Adrian Glaubitz wrote:
> On 04/24/2017 06:19 PM, Kirill A. Shutemov wrote:
> > In the proposed implementation, we also use the hint address, but in a
> > different way: by default, if the hint address is NULL, the kernel will not
> > create mappings above 47 bits, preserving compatibility.
> 
> Ooooh, that would solve a lot of problems actually if it were to be available
> on all architectures. On SPARC, the situation is really annoying and I have
> been discussing a solution with the Qt developers and they suggested a
> similar approach, just one that would also apply to brk() [1].
> 
> > If an application wants access to the larger address space, it has to
> > specify a hint address above 47 bits.
> > 
> > See details here:
> > 
> > http://lkml.kernel.org/r/20170420162147.86517-10-kirill.shutemov@linux.intel.com
> 
> Thanks, I'll have a read. From your message, though, I gather that this
> particular proposal got rejected.

No. It just hasn't been applied yet, so the situation may change.

> Would be really nice to be able to have a canonical solution for this issue,
> it's been biting us on SPARC for quite a while now due to the fact that
> virtual address space has been 52 bits on SPARC for a while now.

Power folks are going to implement a similar approach. I don't see why SPARC
can't go the same route.

-- 
 Kirill A. Shutemov


* Re: Question on the five-level page table support patches
  2017-04-24 20:37   ` John Paul Adrian Glaubitz
  2017-04-24 22:01     ` Kirill A. Shutemov
@ 2017-04-24 22:09     ` David Miller
  2017-04-25  7:25       ` Jon Masters
  1 sibling, 1 reply; 9+ messages in thread
From: David Miller @ 2017-04-24 22:09 UTC (permalink / raw)
  To: glaubitz
  Cc: kirill, kirill.shutemov, linux-kernel, ak, dave.hansen, luto,
	mhocko, linux-arch, linux-mm

From: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Date: Mon, 24 Apr 2017 22:37:40 +0200

> Would be really nice to be able to have a canonical solution for this issue,
> it's been biting us on SPARC for quite a while now due to the fact that
> virtual address space has been 52 bits on SPARC for a while now.

It's going to break again with things like ADI which encode protection
keys in the high bits of the 64-bit virtual address.

Really, it would be nice if these tags were instead encoded in the
low bits of suitably aligned memory allocations, but I am sure it's too
late to do that now.
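
For comparison, low-bit tagging can be sketched as below. This assumes
allocations aligned to at least 8 bytes, which leaves three low bits free;
TAG_BITS and the helper names are hypothetical and chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical: 8-byte alignment leaves the 3 low bits of a pointer zero. */
#define TAG_BITS 3
#define TAG_MASK ((uintptr_t)((1u << TAG_BITS) - 1))

/* Store a small tag in the unused low bits of an aligned pointer. */
uintptr_t tag_low(void *p, unsigned tag)
{
    assert(((uintptr_t)p & TAG_MASK) == 0);   /* pointer must be aligned */
    assert(tag <= TAG_MASK);                  /* tag must fit in TAG_BITS */
    return (uintptr_t)p | tag;
}

/* Strip the tag and recover the original pointer. */
void *untag_ptr(uintptr_t v)
{
    return (void *)(v & ~TAG_MASK);
}

/* Extract the tag from the low bits. */
unsigned untag_tag(uintptr_t v)
{
    return (unsigned)(v & TAG_MASK);
}
```

Because the high bits are left untouched, this scheme does not depend on
how wide the virtual address space is, at the price of far fewer tag bits.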


* Re: Question on the five-level page table support patches
  2017-04-24 22:09     ` David Miller
@ 2017-04-25  7:25       ` Jon Masters
  0 siblings, 0 replies; 9+ messages in thread
From: Jon Masters @ 2017-04-25  7:25 UTC (permalink / raw)
  To: David Miller, glaubitz
  Cc: kirill, kirill.shutemov, linux-kernel, ak, dave.hansen, luto,
	mhocko, linux-arch, linux-mm

On 04/24/2017 06:09 PM, David Miller wrote:
> From: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
> Date: Mon, 24 Apr 2017 22:37:40 +0200
> 
>> Would be really nice to be able to have a canonical solution for this issue,
>> it's been biting us on SPARC for quite a while now due to the fact that
>> virtual address space has been 52 bits on SPARC for a while now.
> 
> It's going to break again with things like ADI which encode protection
> keys in the high bits of the 64-bit virtual address.
> 
> Really, it would be nice if these tags were instead encoded in the
> low bits of suitably aligned memory allocations, but I am sure it's too
> late to do that now.

I'm curious (and hey, ARM has 52-bit VAs coming [0], added in ARMv8.2).
Does anyone really think pointer tagging is a good idea for a
new architecture being created going forward? This could be archived
somewhere so that the folks in Berkeley and elsewhere have an answer.

As an aside, one of the reasons I've been tracking these Intel patches
personally is to figure out the best way to play out the ARMv8 story.
There isn't the same legacy of precompiled code out there (and the
things that broke and were fixed when moving from 42-bit to 48-bit VA
are already accounting for a later switch to 52-bit). I do find it
amusing that I proposed a solution similar to Kirill's a year or so back to
some other folks elsewhere with a similar set of goals in mind.

Jon.

[0] Requires 64K pages on ARMv8. It's one of the previously unmentioned
reasons why RHEL for ARM was built with 64K granule size ;)
