* [RFC PATCH] mmap, aslr: do not enforce legacy mmap on unlimited stacks @ 2017-06-14 8:22 Michal Hocko 2017-06-23 8:46 ` Michal Hocko 2017-06-23 14:02 ` [tip:x86/mm] x86/mmap, ASLR: Do not treat unlimited-stack tasks as legacy mmap tip-bot for Michal Hocko 0 siblings, 2 replies; 9+ messages in thread From: Michal Hocko @ 2017-06-14 8:22 UTC (permalink / raw) To: Thomas Gleixner, Ingo Molnar Cc: Jiri Kosina, Andi Kleen, H. Peter Anvin, LKML, linux-mm, x86, Michal Hocko From: Michal Hocko <mhocko@suse.com> Since cc503c1b43e0 ("x86: PIE executable randomization") we treat applications with RLIMIT_STACK configured to unlimited as legacy and so we a) set the mmap_base to 1/3 of address space + randomization and b) mmap from bottom to top. This makes some sense as it allows the stack to grow really large. On the other hand it reduces the address space usable for default mmaps (wihout address hint) quite a lot. We have received a bug report that SAP HANA workload has hit into this limitation. We could argue that the user just got what he asked for when setting up the unlimited stack but to be realistic growing stack up to 1/6 TASK_SIZE (allowed by mmap_base) is pretty much unimited in the real life. This would give mmap 20TB of additional address space which is quite nice. Especially when it is much more likely to use that address space than the reserved stack. Digging into the history the original implementation of the randomization 8817210d4d96 ("[PATCH] x86_64: Flexmap for 32bit and randomized mappings for 64bit") didn't have this restriction. Signed-off-by: Michal Hocko <mhocko@suse.com> --- Hi, I am sending this as a RFC because I am not really sure how to deal with this. We might as well ignore the reported issue and claim "do not use unlimited stacks" and be done with it. I just stroke me as an unexpected behavior. arch/x86/mm/mmap.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/arch/x86/mm/mmap.c b/arch/x86/mm/mmap.c index 19ad095b41df..797295e792b2 100644 --- a/arch/x86/mm/mmap.c +++ b/arch/x86/mm/mmap.c @@ -74,9 +74,6 @@ static int mmap_is_legacy(void) if (current->personality & ADDR_COMPAT_LAYOUT) return 1; - if (rlimit(RLIMIT_STACK) == RLIM_INFINITY) - return 1; - return sysctl_legacy_va_layout; } -- 2.11.0 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [RFC PATCH] mmap, aslr: do not enforce legacy mmap on unlimited stacks 2017-06-14 8:22 [RFC PATCH] mmap, aslr: do not enforce legacy mmap on unlimited stacks Michal Hocko @ 2017-06-23 8:46 ` Michal Hocko 2017-06-23 14:02 ` [tip:x86/mm] x86/mmap, ASLR: Do not treat unlimited-stack tasks as legacy mmap tip-bot for Michal Hocko 1 sibling, 0 replies; 9+ messages in thread From: Michal Hocko @ 2017-06-23 8:46 UTC (permalink / raw) To: Thomas Gleixner, Ingo Molnar Cc: Jiri Kosina, Andi Kleen, H. Peter Anvin, LKML, linux-mm, x86 ping? On Wed 14-06-17 10:22:18, Michal Hocko wrote: > From: Michal Hocko <mhocko@suse.com> > > Since cc503c1b43e0 ("x86: PIE executable randomization") we treat > applications with RLIMIT_STACK configured to unlimited as legacy > and so we a) set the mmap_base to 1/3 of address space + randomization > and b) mmap from bottom to top. This makes some sense as it allows the > stack to grow really large. On the other hand it reduces the address > space usable for default mmaps (wihout address hint) quite a lot. We > have received a bug report that SAP HANA workload has hit into this > limitation. > > We could argue that the user just got what he asked for when setting > up the unlimited stack but to be realistic growing stack up to 1/6 > TASK_SIZE (allowed by mmap_base) is pretty much unimited in the real > life. This would give mmap 20TB of additional address space which is > quite nice. Especially when it is much more likely to use that address > space than the reserved stack. > > Digging into the history the original implementation of the > randomization 8817210d4d96 ("[PATCH] x86_64: Flexmap for 32bit and > randomized mappings for 64bit") didn't have this restriction. > > Signed-off-by: Michal Hocko <mhocko@suse.com> > --- > > Hi, > I am sending this as a RFC because I am not really sure how to deal with > this. We might as well ignore the reported issue and claim "do not use > unlimited stacks" and be done with it. I just stroke me as an unexpected > behavior. > > arch/x86/mm/mmap.c | 3 --- > 1 file changed, 3 deletions(-) > > diff --git a/arch/x86/mm/mmap.c b/arch/x86/mm/mmap.c > index 19ad095b41df..797295e792b2 100644 > --- a/arch/x86/mm/mmap.c > +++ b/arch/x86/mm/mmap.c > @@ -74,9 +74,6 @@ static int mmap_is_legacy(void) > if (current->personality & ADDR_COMPAT_LAYOUT) > return 1; > > - if (rlimit(RLIMIT_STACK) == RLIM_INFINITY) > - return 1; > - > return sysctl_legacy_va_layout; > } > > -- > 2.11.0 > -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 9+ messages in thread
* [tip:x86/mm] x86/mmap, ASLR: Do not treat unlimited-stack tasks as legacy mmap 2017-06-14 8:22 [RFC PATCH] mmap, aslr: do not enforce legacy mmap on unlimited stacks Michal Hocko 2017-06-23 8:46 ` Michal Hocko @ 2017-06-23 14:02 ` tip-bot for Michal Hocko 2017-06-23 14:54 ` Oleg Nesterov ` (2 more replies) 1 sibling, 3 replies; 9+ messages in thread From: tip-bot for Michal Hocko @ 2017-06-23 14:02 UTC (permalink / raw) To: linux-tip-commits Cc: torvalds, mingo, hpa, jkosina, mhocko, oleg, tglx, davej, peterz, linux-kernel Commit-ID: 86b110d2ae6365ce91cabd37588bc8611770421a Gitweb: http://git.kernel.org/tip/86b110d2ae6365ce91cabd37588bc8611770421a Author: Michal Hocko <mhocko@suse.com> AuthorDate: Wed, 14 Jun 2017 10:22:18 +0200 Committer: Ingo Molnar <mingo@kernel.org> CommitDate: Fri, 23 Jun 2017 11:02:01 +0200 x86/mmap, ASLR: Do not treat unlimited-stack tasks as legacy mmap Since the following commit in 2008: cc503c1b43e0 ("x86: PIE executable randomization") We added a heuristics to treat applications with RLIMIT_STACK configured to unlimited as legacy. This means: a) set the mmap_base to 1/3 of address space + randomization and b) mmap from bottom to top. This makes some sense as it allows the stack to grow really large. On the other hand it reduces the address space usable for default mmaps (without address hint) quite a lot. We have received a bug report that SAP HANA workload has hit into this limitation. We could argue that the user just got what he asked for when setting up the unlimited stack but to be realistic growing stack up to 1/6 TASK_SIZE (allowed by mmap_base) is pretty much unimited in the real life. This would give mmap 20TB of additional address space which is quite nice. Especially when it is much more likely to use that address space than the reserved stack. Digging into the history the original implementation of the randomization: 8817210d4d96 ("[PATCH] x86_64: Flexmap for 32bit and randomized mappings for 64bit") didn't have this restriction. So let's try and remove this assumption - hopefully nothing breaks. Signed-off-by: Michal Hocko <mhocko@suse.com> Cc: Dave Jones <davej@codemonkey.org.uk> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: akpm@linux-foundation.org Cc: hughd@google.com Cc: linux-mm@kvack.org Cc: will.deacon@arm.com Link: http://lkml.kernel.org/r/20170614082218.12450-1-mhocko@kernel.org [ So I've applied this to tip:x86/mm with a wider Cc: list - if anyone objects to this change please holler. ] Signed-off-by: Ingo Molnar <mingo@kernel.org> --- arch/x86/mm/mmap.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/arch/x86/mm/mmap.c b/arch/x86/mm/mmap.c index 19ad095..797295e 100644 --- a/arch/x86/mm/mmap.c +++ b/arch/x86/mm/mmap.c @@ -74,9 +74,6 @@ static int mmap_is_legacy(void) if (current->personality & ADDR_COMPAT_LAYOUT) return 1; - if (rlimit(RLIMIT_STACK) == RLIM_INFINITY) - return 1; - return sysctl_legacy_va_layout; } ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [tip:x86/mm] x86/mmap, ASLR: Do not treat unlimited-stack tasks as legacy mmap 2017-06-23 14:02 ` [tip:x86/mm] x86/mmap, ASLR: Do not treat unlimited-stack tasks as legacy mmap tip-bot for Michal Hocko @ 2017-06-23 14:54 ` Oleg Nesterov 2017-06-27 8:00 ` Jiri Kosina 2017-06-23 20:35 ` Jiri Kosina 2017-06-24 6:43 ` tip-bot for Michal Hocko 2 siblings, 1 reply; 9+ messages in thread From: Oleg Nesterov @ 2017-06-23 14:54 UTC (permalink / raw) To: tip-bot for Michal Hocko Cc: linux-tip-commits, torvalds, mingo, hpa, jkosina, mhocko, tglx, davej, peterz, linux-kernel On 06/23, tip-bot for Michal Hocko wrote: > > We added a heuristics to treat applications with RLIMIT_STACK configured > to unlimited as legacy. This means: To me this also means a minor security problem. The comment above PER_CLEAR_ON_SETID says "must be cleared upon setuid or setgid exec", but if you do "ulimit -s unlimited" before suid exec then ADDR_COMPAT_LAYOUT set by security checks will be ignored. > So let's try and remove this assumption - hopefully nothing breaks. Agreed. Oleg. ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [tip:x86/mm] x86/mmap, ASLR: Do not treat unlimited-stack tasks as legacy mmap 2017-06-23 14:54 ` Oleg Nesterov @ 2017-06-27 8:00 ` Jiri Kosina 2017-06-27 14:22 ` Oleg Nesterov 0 siblings, 1 reply; 9+ messages in thread From: Jiri Kosina @ 2017-06-27 8:00 UTC (permalink / raw) To: Oleg Nesterov Cc: tip-bot for Michal Hocko, linux-tip-commits, torvalds, mingo, hpa, mhocko, tglx, davej, peterz, linux-kernel On Fri, 23 Jun 2017, Oleg Nesterov wrote: > > We added a heuristics to treat applications with RLIMIT_STACK configured > > to unlimited as legacy. This means: > > To me this also means a minor security problem. The comment above > PER_CLEAR_ON_SETID says "must be cleared upon setuid or setgid exec", > but if you do "ulimit -s unlimited" before suid exec then > ADDR_COMPAT_LAYOUT set by security checks will be ignored. Could you please be a bit more specific here? mmap_is_legacy() *first* checks for the ADDR_COMPAT_LAYOUT in the personality flags, and only then, if it's unset, RLIMIT_STACK comes to play. Thanks, -- Jiri Kosina SUSE Labs ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [tip:x86/mm] x86/mmap, ASLR: Do not treat unlimited-stack tasks as legacy mmap 2017-06-27 8:00 ` Jiri Kosina @ 2017-06-27 14:22 ` Oleg Nesterov 2017-06-28 9:40 ` Jiri Kosina 0 siblings, 1 reply; 9+ messages in thread From: Oleg Nesterov @ 2017-06-27 14:22 UTC (permalink / raw) To: Jiri Kosina Cc: tip-bot for Michal Hocko, linux-tip-commits, torvalds, mingo, hpa, mhocko, tglx, davej, peterz, linux-kernel On 06/27, Jiri Kosina wrote: > > On Fri, 23 Jun 2017, Oleg Nesterov wrote: > > > > We added a heuristics to treat applications with RLIMIT_STACK configured > > > to unlimited as legacy. This means: > > > > To me this also means a minor security problem. The comment above > > PER_CLEAR_ON_SETID says "must be cleared upon setuid or setgid exec", > > but if you do "ulimit -s unlimited" before suid exec then > > ADDR_COMPAT_LAYOUT set by security checks will be ignored. > > Could you please be a bit more specific here? > > mmap_is_legacy() *first* checks for the ADDR_COMPAT_LAYOUT in the > personality flags, and only then, if it's unset, RLIMIT_STACK comes to > play. Yes, and this means that even if ADDR_COMPAT_LAYOUT was cleared by current->personality &= ~bprm->per_clear; in flush_old_exec() mmap_is_legacy() still returns true if rlimit(STACK) == INFINITY. IOW. Say, in case of suid exec bprm_fill_uid() sets bprm->per_clear = PER_CLEAR_ON_SETID which includes ADDR_COMPAT_LAYOUT. To me, this means that we do not want the legacy layout after suid exec, but "ulimit -s unlimited" can be used to break the rule. And let me quote my "rlimits && suid exec" email I sent some time before... Imo RLIMITs are almost pointless security-wise, but now it seems to me they can harm. Say, the comment above PER_CLEAR_ON_SETID says "must be cleared upon setuid or setgid exec" and this mask includes ADDR_COMPAT_LAYOUT. OK, this makes sense, but this doesn't really work because you can just do "$ ulimit -s unlimited" before suid exec and this will make mmap_is_legacy() return true. Of course, only if rlim_max=RLIM_INFINITY, but afaik usually this is true. Or you can lower RLIMIT_STACK to make suid app crash inside some "system critical" section... And even if we forget about the potential security impact, isn't it strange that suid exec inherits RLIMITs from non-root process? Perhaps it makes sense to reset RLIMITs on suid exec (say, if bprm->per_clear is not zero) ? Yes, it is not clear how should we define SANE_RLIMITS_FOR_SUID, and this should probably depend on sysctl, etc. Oleg. ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [tip:x86/mm] x86/mmap, ASLR: Do not treat unlimited-stack tasks as legacy mmap 2017-06-27 14:22 ` Oleg Nesterov @ 2017-06-28 9:40 ` Jiri Kosina 0 siblings, 0 replies; 9+ messages in thread From: Jiri Kosina @ 2017-06-28 9:40 UTC (permalink / raw) To: Oleg Nesterov Cc: tip-bot for Michal Hocko, linux-tip-commits, torvalds, mingo, hpa, mhocko, tglx, davej, peterz, linux-kernel On Tue, 27 Jun 2017, Oleg Nesterov wrote: > Perhaps it makes sense to reset RLIMITs on suid exec (say, if > bprm->per_clear is not zero) ? Yes, it is not clear how should we define > SANE_RLIMITS_FOR_SUID, and this should probably depend on sysctl, etc. Hmm, this should be an userspace-defined policy. On a 'standard' (PAM-based) system, I think a sane expectation would be to get the same limits as the ones enforced by pam_limits configuration, but syncing those with kernel feels awkward. Thanks, -- Jiri Kosina SUSE Labs ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [tip:x86/mm] x86/mmap, ASLR: Do not treat unlimited-stack tasks as legacy mmap 2017-06-23 14:02 ` [tip:x86/mm] x86/mmap, ASLR: Do not treat unlimited-stack tasks as legacy mmap tip-bot for Michal Hocko 2017-06-23 14:54 ` Oleg Nesterov @ 2017-06-23 20:35 ` Jiri Kosina 2017-06-24 6:43 ` tip-bot for Michal Hocko 2 siblings, 0 replies; 9+ messages in thread From: Jiri Kosina @ 2017-06-23 20:35 UTC (permalink / raw) To: Linus Torvalds, mingo, Michal Hocko, hpa, oleg, Thomas Gleixner, davej, peterz, linux-kernel Cc: linux-tip-commits On Fri, 23 Jun 2017, tip-bot for Michal Hocko wrote: > TASK_SIZE (allowed by mmap_base) is pretty much unimited in the real > life. This would give mmap 20TB of additional address space which is > quite nice. Especially when it is much more likely to use that address > space than the reserved stack. > > Digging into the history the original implementation of the randomization: > > 8817210d4d96 ("[PATCH] x86_64: Flexmap for 32bit and randomized mappings for 64bit") > > didn't have this restriction. > > So let's try and remove this assumption - hopefully nothing breaks. > > Signed-off-by: Michal Hocko <mhocko@suse.com> > Cc: Dave Jones <davej@codemonkey.org.uk> > Cc: Jiri Kosina <jkosina@suse.cz> > Cc: Linus Torvalds <torvalds@linux-foundation.org> > Cc: Oleg Nesterov <oleg@redhat.com> > Cc: Peter Zijlstra <peterz@infradead.org> > Cc: Thomas Gleixner <tglx@linutronix.de> > Cc: akpm@linux-foundation.org > Cc: hughd@google.com > Cc: linux-mm@kvack.org > Cc: will.deacon@arm.com > Link: http://lkml.kernel.org/r/20170614082218.12450-1-mhocko@kernel.org > [ So I've applied this to tip:x86/mm with a wider Cc: list - if anyone objects to this change please holler. ] > Signed-off-by: Ingo Molnar <mingo@kernel.org> > --- > arch/x86/mm/mmap.c | 3 --- > 1 file changed, 3 deletions(-) > > diff --git a/arch/x86/mm/mmap.c b/arch/x86/mm/mmap.c > index 19ad095..797295e 100644 > --- a/arch/x86/mm/mmap.c > +++ b/arch/x86/mm/mmap.c > @@ -74,9 +74,6 @@ static int mmap_is_legacy(void) > if (current->personality & ADDR_COMPAT_LAYOUT) > return 1; > > - if (rlimit(RLIMIT_STACK) == RLIM_INFINITY) > - return 1; > - >From the person who actually added this particular restriction Acked-by: Jiri Kosina <jkosina@suse.cz> I think it was some historical remnant from the times when 32/64 arch code split still existed, but can't really recall any details after all those years; it doesn't make sense to me any more. Thanks, -- Jiri Kosina SUSE Labs ^ permalink raw reply [flat|nested] 9+ messages in thread
* [tip:x86/mm] x86/mmap, ASLR: Do not treat unlimited-stack tasks as legacy mmap 2017-06-23 14:02 ` [tip:x86/mm] x86/mmap, ASLR: Do not treat unlimited-stack tasks as legacy mmap tip-bot for Michal Hocko 2017-06-23 14:54 ` Oleg Nesterov 2017-06-23 20:35 ` Jiri Kosina @ 2017-06-24 6:43 ` tip-bot for Michal Hocko 2 siblings, 0 replies; 9+ messages in thread From: tip-bot for Michal Hocko @ 2017-06-24 6:43 UTC (permalink / raw) To: linux-tip-commits Cc: hpa, mingo, oleg, jkosina, davej, linux-kernel, mhocko, tglx, peterz, torvalds Commit-ID: 4a06370bcb674af88679a4f2c5c87c3e40688935 Gitweb: http://git.kernel.org/tip/4a06370bcb674af88679a4f2c5c87c3e40688935 Author: Michal Hocko <mhocko@suse.com> AuthorDate: Wed, 14 Jun 2017 10:22:18 +0200 Committer: Ingo Molnar <mingo@kernel.org> CommitDate: Sat, 24 Jun 2017 08:39:16 +0200 x86/mmap, ASLR: Do not treat unlimited-stack tasks as legacy mmap Since the following commit in 2008: cc503c1b43e0 ("x86: PIE executable randomization") We added a heuristics to treat applications with RLIMIT_STACK configured to unlimited as legacy. This means: a) set the mmap_base to 1/3 of address space + randomization and b) mmap from bottom to top. This makes some sense as it allows the stack to grow really large. On the other hand it reduces the address space usable for default mmaps (without address hint) quite a lot. We have received a bug report that SAP HANA workload has hit into this limitation. We could argue that the user just got what he asked for when setting up the unlimited stack but to be realistic growing stack up to 1/6 TASK_SIZE (allowed by mmap_base) is pretty much unimited in the real life. This would give mmap 20TB of additional address space which is quite nice. Especially when it is much more likely to use that address space than the reserved stack. Digging into the history the original implementation of the randomization: 8817210d4d96 ("[PATCH] x86_64: Flexmap for 32bit and randomized mappings for 64bit") didn't have this restriction. So let's try and remove this assumption - hopefully nothing breaks. Signed-off-by: Michal Hocko <mhocko@suse.com> Acked-by: Jiri Kosina <jkosina@suse.cz> Acked-by: Oleg Nesterov <oleg@redhat.com> Cc: Dave Jones <davej@codemonkey.org.uk> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: akpm@linux-foundation.org Cc: hughd@google.com Cc: linux-mm@kvack.org Cc: will.deacon@arm.com Link: http://lkml.kernel.org/r/tip-86b110d2ae6365ce91cabd37588bc8611770421a@git.kernel.org [ So I've applied this to tip:x86/mm with a wider Cc: list - if anyone objects to this change please holler. ] Signed-off-by: Ingo Molnar <mingo@kernel.org> --- arch/x86/mm/mmap.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/arch/x86/mm/mmap.c b/arch/x86/mm/mmap.c index 19ad095..797295e 100644 --- a/arch/x86/mm/mmap.c +++ b/arch/x86/mm/mmap.c @@ -74,9 +74,6 @@ static int mmap_is_legacy(void) if (current->personality & ADDR_COMPAT_LAYOUT) return 1; - if (rlimit(RLIMIT_STACK) == RLIM_INFINITY) - return 1; - return sysctl_legacy_va_layout; } ^ permalink raw reply related [flat|nested] 9+ messages in thread
end of thread, other threads:[~2017-06-28 9:40 UTC | newest] Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2017-06-14 8:22 [RFC PATCH] mmap, aslr: do not enforce legacy mmap on unlimited stacks Michal Hocko 2017-06-23 8:46 ` Michal Hocko 2017-06-23 14:02 ` [tip:x86/mm] x86/mmap, ASLR: Do not treat unlimited-stack tasks as legacy mmap tip-bot for Michal Hocko 2017-06-23 14:54 ` Oleg Nesterov 2017-06-27 8:00 ` Jiri Kosina 2017-06-27 14:22 ` Oleg Nesterov 2017-06-28 9:40 ` Jiri Kosina 2017-06-23 20:35 ` Jiri Kosina 2017-06-24 6:43 ` tip-bot for Michal Hocko
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).