From mboxrd@z Thu Jan 1 00:00:00 1970
From: Linus Torvalds
Subject: Re: [PATCHv3 33/33] mm, x86: introduce PR_SET_MAX_VADDR and PR_GET_MAX_VADDR
Date: Fri, 17 Feb 2017 15:21:27 -0800
Message-ID:
References: <20170217141328.164563-1-kirill.shutemov@linux.intel.com> <20170217141328.164563-34-kirill.shutemov@linux.intel.com>
Mime-Version: 1.0
Content-Type: multipart/alternative; boundary=001a113d328a51fe850548c22c1c
Return-path:
In-Reply-To:
Sender: owner-linux-mm@kvack.org
To: Andy Lutomirski
Cc: Thomas Gleixner, Ingo Molnar, Andrew Morton, "linux-arch@vger.kernel.org", Linux API, the arch/x86 maintainers, Andi Kleen, "Kirill A. Shutemov", Arnd Bergmann, Dave Hansen, Linux Kernel Mailing List, Catalin Marinas, "H. Peter Anvin", linux-mm
List-Id: linux-api@vger.kernel.org

--001a113d328a51fe850548c22c1c
Content-Type: text/plain; charset=UTF-8

On Feb 17, 2017 3:02 PM, "Andy Lutomirski" wrote:

> What I'm trying to say is: if we're going to do the route of 48-bit
> limit unless a specific mmap call requests otherwise, can we at least
> have an interface that doesn't suck?

No, I'm not suggesting specific mmap calls at all. I'm suggesting the
complete opposite: not having some magical "max address" at all in the
VM layer. Keep all the existing TASK_SIZE defines as-is, and just make
those be the new 56-bit limit.

But to then not make most processes use it, just make the default x86
arch_get_free_area() return an address limited to the old 47-bit limit.
So effectively all legacy programs work exactly the same way they
always did.

Then there are escape mechanisms: the process control that expands that
x86 arch_get_free_area() to give high addresses. That would be the
normal thing.

But also, exactly *because* we don't make all those TASK_SIZE changes,
you could - if you wanted to - use MAP_FIXED to just allocate directly
in high virtual space.
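
To make the split above concrete, here is a minimal userspace model in
C. It is not kernel code and not the patch under discussion. The names
proc_model, wants_high_addrs, search_limit() and fixed_request_ok() are
hypothetical, and the two limits are simplified to plain powers of two
(the real TASK_SIZE values differ slightly). It only sketches the idea
that the default free-area search honours one per-process switch, while
an explicit fixed-address request is checked against the full limit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stddef.h>

/* Simplified, assumed limits for illustration only. */
#define LEGACY_LIMIT (1ULL << 47) /* old ceiling for user addresses */
#define FULL_LIMIT   (1ULL << 56) /* TASK_SIZE raised to the new limit */

/* Hypothetical per-process state set by the process control (prctl). */
struct proc_model {
    bool wants_high_addrs;
};

/*
 * Upper bound for the default, non-fixed free-area search. Legacy
 * processes never see an address above the old limit.
 */
static inline uint64_t search_limit(const struct proc_model *p)
{
    assert(p != NULL);
    return p->wants_high_addrs ? FULL_LIMIT : LEGACY_LIMIT;
}

/*
 * A MAP_FIXED-style request names its own address, so it bypasses the
 * search and is only checked against the full limit, with no opt-in.
 */
static inline bool fixed_request_ok(uint64_t addr, uint64_t len)
{
    return len <= FULL_LIMIT && addr <= FULL_LIMIT - len;
}
```

In this model a process that never touches the switch gets
search_limit() == LEGACY_LIMIT, yet fixed_request_ok(1ULL << 48, 4096)
still succeeds, which is the "private allocator" case.
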
For example, maybe you just make your own private memory allocator do
that, and all the normal stuff would just continue to use the low
virtual addresses, and you wouldn't even bother with the prctl().

Because let's face it, the processes that will want the high virtual
addresses are going to be fairly few and specialised. Maybe even those
will want it only for special things (like mapping a huge area of
nonvolatile memory).

So I'm saying:

 - don't do all these magical TASK_SIZE things at all

 - don't mess with generic mm code at all.

 - only change arch_get_free_area() to take one single process control
   issue into account.

Keep it simple and stupid, and don't make this address size expansion
something that the core mm code needs to even know about.

    Linus

--001a113d328a51fe850548c22c1c
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable


--001a113d328a51fe850548c22c1c--

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org