* [PATCH] mm: Add additional consistency check @ 2017-03-31 16:40 Kees Cook 2017-03-31 21:33 ` Andrew Morton ` (2 more replies) 0 siblings, 3 replies; 41+ messages in thread From: Kees Cook @ 2017-03-31 16:40 UTC (permalink / raw) To: Andrew Morton Cc: Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim, linux-mm, linux-kernel As found in PaX, this adds a cheap check on heap consistency, just to notice if things have gotten corrupted in the page lookup. Signed-off-by: Kees Cook <keescook@chromium.org> --- mm/slab.h | 1 + 1 file changed, 1 insertion(+) diff --git a/mm/slab.h b/mm/slab.h index 65e7c3fcac72..64447640b70c 100644 --- a/mm/slab.h +++ b/mm/slab.h @@ -384,6 +384,7 @@ static inline struct kmem_cache *cache_from_obj(struct kmem_cache *s, void *x) return s; page = virt_to_head_page(x); + BUG_ON(!PageSlab(page)); cachep = page->slab_cache; if (slab_equal_or_root(cachep, s)) return cachep; -- 2.7.4 -- Kees Cook Pixel Security ^ permalink raw reply related [flat|nested] 41+ messages in thread
* Re: [PATCH] mm: Add additional consistency check 2017-03-31 16:40 [PATCH] mm: Add additional consistency check Kees Cook @ 2017-03-31 21:33 ` Andrew Morton 2017-04-01 0:04 ` Kees Cook 2017-04-04 11:30 ` Michal Hocko 2017-04-11 18:30 ` Christoph Lameter 2 siblings, 1 reply; 41+ messages in thread From: Andrew Morton @ 2017-03-31 21:33 UTC (permalink / raw) To: Kees Cook Cc: Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim, linux-mm, linux-kernel On Fri, 31 Mar 2017 09:40:28 -0700 Kees Cook <keescook@chromium.org> wrote: > As found in PaX, this adds a cheap check on heap consistency, just to > notice if things have gotten corrupted in the page lookup. "As found in PaX" isn't a very illuminating justification for such a change. Was there a real kernel bug which this would have exposed, or what? > --- a/mm/slab.h > +++ b/mm/slab.h > @@ -384,6 +384,7 @@ static inline struct kmem_cache *cache_from_obj(struct kmem_cache *s, void *x) > return s; > > page = virt_to_head_page(x); > + BUG_ON(!PageSlab(page)); > cachep = page->slab_cache; > if (slab_equal_or_root(cachep, s)) > return cachep; BUG_ON might be too severe. I expect the kindest VM_WARN_ON_ONCE() would suffice here, but without more details it is hard to say. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] mm: Add additional consistency check 2017-03-31 21:33 ` Andrew Morton @ 2017-04-01 0:04 ` Kees Cook 2017-04-03 3:40 ` Michael Ellerman 0 siblings, 1 reply; 41+ messages in thread From: Kees Cook @ 2017-04-01 0:04 UTC (permalink / raw) To: Andrew Morton Cc: Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim, Linux-MM, LKML On Fri, Mar 31, 2017 at 2:33 PM, Andrew Morton <akpm@linux-foundation.org> wrote: > On Fri, 31 Mar 2017 09:40:28 -0700 Kees Cook <keescook@chromium.org> wrote: > >> As found in PaX, this adds a cheap check on heap consistency, just to >> notice if things have gotten corrupted in the page lookup. > > "As found in PaX" isn't a very illuminating justification for such a > change. Was there a real kernel bug which this would have exposed, or > what? I don't know off the top of my head, but given the kinds of heap attacks I've been seeing, I think this added consistency check is worth it given how inexpensive it is. When heap metadata gets corrupted, we can get into nasty side-effects that can be attacker-controlled, so better to catch obviously bad states as early as possible. >> --- a/mm/slab.h >> +++ b/mm/slab.h >> @@ -384,6 +384,7 @@ static inline struct kmem_cache *cache_from_obj(struct kmem_cache *s, void *x) >> return s; >> >> page = virt_to_head_page(x); >> + BUG_ON(!PageSlab(page)); >> cachep = page->slab_cache; >> if (slab_equal_or_root(cachep, s)) >> return cachep; > > BUG_ON might be too severe. I expect the kindest VM_WARN_ON_ONCE() > would suffice here, but without more details it is hard to say. So, WARN isn't enough to protect the kernel (execution continues and the memory is still dereferenced for malicious purposes, etc). Perhaps use CHECK_DATA_CORRUPTION() here, which can either WARN and take a "safe" path, or BUG (depending on config paranoia of the builder). I've got a series adding it in a number of other places, so I could add this patch to that series? -Kees -- Kees Cook Pixel Security ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] mm: Add additional consistency check 2017-04-01 0:04 ` Kees Cook @ 2017-04-03 3:40 ` Michael Ellerman 2017-04-03 14:03 ` Christoph Lameter 0 siblings, 1 reply; 41+ messages in thread From: Michael Ellerman @ 2017-04-03 3:40 UTC (permalink / raw) To: Kees Cook, Andrew Morton Cc: Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim, Linux-MM, LKML Kees Cook <keescook@chromium.org> writes: > On Fri, Mar 31, 2017 at 2:33 PM, Andrew Morton > <akpm@linux-foundation.org> wrote: >> On Fri, 31 Mar 2017 09:40:28 -0700 Kees Cook <keescook@chromium.org> wrote: >> >>> As found in PaX, this adds a cheap check on heap consistency, just to >>> notice if things have gotten corrupted in the page lookup. >> >> "As found in PaX" isn't a very illuminating justification for such a >> change. Was there a real kernel bug which this would have exposed, or >> what? > > I don't know off the top of my head, but given the kinds of heap > attacks I've been seeing, I think this added consistency check is > worth it given how inexpensive it is. When heap metadata gets > corrupted, we can get into nasty side-effects that can be > attacker-controlled, so better to catch obviously bad states as early > as possible. There's your changelog :) >>> --- a/mm/slab.h >>> +++ b/mm/slab.h >>> @@ -384,6 +384,7 @@ static inline struct kmem_cache *cache_from_obj(struct kmem_cache *s, void *x) >>> return s; >>> >>> page = virt_to_head_page(x); >>> + BUG_ON(!PageSlab(page)); >>> cachep = page->slab_cache; >>> if (slab_equal_or_root(cachep, s)) >>> return cachep; >> >> BUG_ON might be too severe. I expect the kindest VM_WARN_ON_ONCE() >> would suffice here, but without more details it is hard to say. > > So, WARN isn't enough to protect the kernel (execution continues and > the memory is still dereferenced for malicious purposes, etc). You could do: if (WARN_ON(!PageSlab(page))) return NULL. Though I see at least two callers that don't check for a NULL return. Looking at the context, the tail of the function already contains: pr_err("%s: Wrong slab cache. %s but object is from %s\n", __func__, s->name, cachep->name); WARN_ON_ONCE(1); return s; } At least in slab.c it seems that would allow you to "free" an object from one kmem_cache onto the array_cache of another kmem_cache, which seems fishy. But maybe there's a check somewhere I'm missing? cheers ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] mm: Add additional consistency check 2017-04-03 3:40 ` Michael Ellerman @ 2017-04-03 14:03 ` Christoph Lameter 2017-04-03 14:53 ` Matthew Wilcox 0 siblings, 1 reply; 41+ messages in thread From: Christoph Lameter @ 2017-04-03 14:03 UTC (permalink / raw) To: Michael Ellerman Cc: Kees Cook, Andrew Morton, Pekka Enberg, David Rientjes, Joonsoo Kim, Linux-MM, LKML On Mon, 3 Apr 2017, Michael Ellerman wrote: > At least in slab.c it seems that would allow you to "free" an object > from one kmem_cache onto the array_cache of another kmem_cache, which > seems fishy. But maybe there's a check somewhere I'm missing? kfree can be used to free any object from any slab cache. kmem_cache_free() checks if the object belongs to the cache given. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] mm: Add additional consistency check 2017-04-03 14:03 ` Christoph Lameter @ 2017-04-03 14:53 ` Matthew Wilcox 0 siblings, 0 replies; 41+ messages in thread From: Matthew Wilcox @ 2017-04-03 14:53 UTC (permalink / raw) To: Christoph Lameter Cc: Michael Ellerman, Kees Cook, Andrew Morton, Pekka Enberg, David Rientjes, Joonsoo Kim, Linux-MM, LKML On Mon, Apr 03, 2017 at 09:03:50AM -0500, Christoph Lameter wrote: > On Mon, 3 Apr 2017, Michael Ellerman wrote: > > > At least in slab.c it seems that would allow you to "free" an object > > from one kmem_cache onto the array_cache of another kmem_cache, which > > seems fishy. But maybe there's a check somewhere I'm missing? > > kfree can be used to free any object from any slab cache. Is that a guarantee? There's some wording in the RCU free code that seems to indicate we can't rely on that being true. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] mm: Add additional consistency check 2017-03-31 16:40 [PATCH] mm: Add additional consistency check Kees Cook 2017-03-31 21:33 ` Andrew Morton @ 2017-04-04 11:30 ` Michal Hocko 2017-04-04 15:07 ` Christoph Lameter 2017-04-27 12:06 ` Michal Hocko 2017-04-11 18:30 ` Christoph Lameter 2 siblings, 2 replies; 41+ messages in thread From: Michal Hocko @ 2017-04-04 11:30 UTC (permalink / raw) To: Kees Cook Cc: Andrew Morton, Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim, linux-mm, linux-kernel On Fri 31-03-17 09:40:28, Kees Cook wrote: > As found in PaX, this adds a cheap check on heap consistency, just to > notice if things have gotten corrupted in the page lookup. > > Signed-off-by: Kees Cook <keescook@chromium.org> NAK without a proper changelog. Seriously, we do not blindly apply changes from other projects without a deep understanding of all consequences. > --- > mm/slab.h | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/mm/slab.h b/mm/slab.h > index 65e7c3fcac72..64447640b70c 100644 > --- a/mm/slab.h > +++ b/mm/slab.h > @@ -384,6 +384,7 @@ static inline struct kmem_cache *cache_from_obj(struct kmem_cache *s, void *x) > return s; > > page = virt_to_head_page(x); > + BUG_ON(!PageSlab(page)); > cachep = page->slab_cache; > if (slab_equal_or_root(cachep, s)) > return cachep; > -- > 2.7.4 > > > -- > Kees Cook > Pixel Security > > -- > To unsubscribe, send a message with 'unsubscribe linux-mm' in > the body to majordomo@kvack.org. For more info on Linux MM, > see: http://www.linux-mm.org/ . > Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] mm: Add additional consistency check 2017-04-04 11:30 ` Michal Hocko @ 2017-04-04 15:07 ` Christoph Lameter 2017-04-04 15:16 ` Michal Hocko 2017-04-27 12:06 ` Michal Hocko 1 sibling, 1 reply; 41+ messages in thread From: Christoph Lameter @ 2017-04-04 15:07 UTC (permalink / raw) To: Michal Hocko Cc: Kees Cook, Andrew Morton, Pekka Enberg, David Rientjes, Joonsoo Kim, linux-mm, linux-kernel On Tue, 4 Apr 2017, Michal Hocko wrote: > NAK without a proper changelog. Seriously, we do not blindly apply > changes from other projects without a deep understanding of all > consequences. Functionalitywise this is trivial. A page must be a slab page in order to be able to determine the slab cache of an object. Its definitely not ok if the page is not a slab page. The main issue that may exist here is the adding of overhead to a critical code path like kfree(). ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] mm: Add additional consistency check 2017-04-04 15:07 ` Christoph Lameter @ 2017-04-04 15:16 ` Michal Hocko 2017-04-04 15:46 ` Kees Cook 2017-04-04 19:13 ` Christoph Lameter 0 siblings, 2 replies; 41+ messages in thread From: Michal Hocko @ 2017-04-04 15:16 UTC (permalink / raw) To: Christoph Lameter Cc: Kees Cook, Andrew Morton, Pekka Enberg, David Rientjes, Joonsoo Kim, linux-mm, linux-kernel On Tue 04-04-17 10:07:23, Cristopher Lameter wrote: > On Tue, 4 Apr 2017, Michal Hocko wrote: > > > NAK without a proper changelog. Seriously, we do not blindly apply > > changes from other projects without a deep understanding of all > > consequences. > > Functionalitywise this is trivial. A page must be a slab page in order to > be able to determine the slab cache of an object. Its definitely not ok if > the page is not a slab page. Yes, but we do not have to blow the kernel, right? Why cannot we simply leak that memory? > The main issue that may exist here is the adding of overhead to a critical > code path like kfree(). Yes, nothing is for free. But if the attack space is real then we probably want to sacrifice few cycles (to simply return ASAP without further further processing). This all should be in the changelog ideally with some numbers. I suspect this would be hard to measure in most workloads. -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] mm: Add additional consistency check 2017-04-04 15:16 ` Michal Hocko @ 2017-04-04 15:46 ` Kees Cook 2017-04-04 15:58 ` Michal Hocko 2017-04-04 19:13 ` Christoph Lameter 1 sibling, 1 reply; 41+ messages in thread From: Kees Cook @ 2017-04-04 15:46 UTC (permalink / raw) To: Michal Hocko Cc: Christoph Lameter, Andrew Morton, Pekka Enberg, David Rientjes, Joonsoo Kim, Linux-MM, LKML On Tue, Apr 4, 2017 at 8:16 AM, Michal Hocko <mhocko@kernel.org> wrote: > On Tue 04-04-17 10:07:23, Cristopher Lameter wrote: >> On Tue, 4 Apr 2017, Michal Hocko wrote: >> >> > NAK without a proper changelog. Seriously, we do not blindly apply >> > changes from other projects without a deep understanding of all >> > consequences. >> >> Functionalitywise this is trivial. A page must be a slab page in order to >> be able to determine the slab cache of an object. Its definitely not ok if >> the page is not a slab page. > > Yes, but we do not have to blow the kernel, right? Why cannot we simply > leak that memory? I can put this behind CHECK_DATA_CORRUPTION() instead of BUG(), which allows the system builder to choose between WARN and BUG. Some people absolutely want the kernel to BUG on data corruption as it could be an attack. >> The main issue that may exist here is the adding of overhead to a critical >> code path like kfree(). > > Yes, nothing is for free. But if the attack space is real then we > probably want to sacrifice few cycles (to simply return ASAP without > further further processing). This all should be in the changelog ideally > with some numbers. I suspect this would be hard to measure in most > workloads. Given the trivial nature of the check, yeah, it seemed impossible to actually show performance changes. -Kees -- Kees Cook Pixel Security ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] mm: Add additional consistency check 2017-04-04 15:46 ` Kees Cook @ 2017-04-04 15:58 ` Michal Hocko 2017-04-04 16:02 ` Kees Cook 0 siblings, 1 reply; 41+ messages in thread From: Michal Hocko @ 2017-04-04 15:58 UTC (permalink / raw) To: Kees Cook Cc: Christoph Lameter, Andrew Morton, Pekka Enberg, David Rientjes, Joonsoo Kim, Linux-MM, LKML On Tue 04-04-17 08:46:02, Kees Cook wrote: > On Tue, Apr 4, 2017 at 8:16 AM, Michal Hocko <mhocko@kernel.org> wrote: > > On Tue 04-04-17 10:07:23, Cristopher Lameter wrote: > >> On Tue, 4 Apr 2017, Michal Hocko wrote: > >> > >> > NAK without a proper changelog. Seriously, we do not blindly apply > >> > changes from other projects without a deep understanding of all > >> > consequences. > >> > >> Functionalitywise this is trivial. A page must be a slab page in order to > >> be able to determine the slab cache of an object. Its definitely not ok if > >> the page is not a slab page. > > > > Yes, but we do not have to blow the kernel, right? Why cannot we simply > > leak that memory? > > I can put this behind CHECK_DATA_CORRUPTION() instead of BUG(), which > allows the system builder to choose between WARN and BUG. Some people > absolutely want the kernel to BUG on data corruption as it could be an > attack. CHECK_DATA_CORRUPTION sounds as better fit to me. This would, however require to handle the potenial corruption by returning and leaking the memory. -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] mm: Add additional consistency check 2017-04-04 15:58 ` Michal Hocko @ 2017-04-04 16:02 ` Kees Cook 0 siblings, 0 replies; 41+ messages in thread From: Kees Cook @ 2017-04-04 16:02 UTC (permalink / raw) To: Michal Hocko Cc: Christoph Lameter, Andrew Morton, Pekka Enberg, David Rientjes, Joonsoo Kim, Linux-MM, LKML On Tue, Apr 4, 2017 at 8:58 AM, Michal Hocko <mhocko@kernel.org> wrote: > On Tue 04-04-17 08:46:02, Kees Cook wrote: >> On Tue, Apr 4, 2017 at 8:16 AM, Michal Hocko <mhocko@kernel.org> wrote: >> > On Tue 04-04-17 10:07:23, Cristopher Lameter wrote: >> >> On Tue, 4 Apr 2017, Michal Hocko wrote: >> >> >> >> > NAK without a proper changelog. Seriously, we do not blindly apply >> >> > changes from other projects without a deep understanding of all >> >> > consequences. >> >> >> >> Functionalitywise this is trivial. A page must be a slab page in order to >> >> be able to determine the slab cache of an object. Its definitely not ok if >> >> the page is not a slab page. >> > >> > Yes, but we do not have to blow the kernel, right? Why cannot we simply >> > leak that memory? >> >> I can put this behind CHECK_DATA_CORRUPTION() instead of BUG(), which >> allows the system builder to choose between WARN and BUG. Some people >> absolutely want the kernel to BUG on data corruption as it could be an >> attack. > > CHECK_DATA_CORRUPTION sounds as better fit to me. This would, however > require to handle the potenial corruption by returning and leaking the > memory. IIUC, that would be the "return s" path? I should likely change the WARN_ON_ONCE there to be CHECK_DATA_CORRUPTION too. I'll add this to my series. -Kees -- Kees Cook Pixel Security ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] mm: Add additional consistency check 2017-04-04 15:16 ` Michal Hocko 2017-04-04 15:46 ` Kees Cook @ 2017-04-04 19:13 ` Christoph Lameter 2017-04-04 19:42 ` Michal Hocko 1 sibling, 1 reply; 41+ messages in thread From: Christoph Lameter @ 2017-04-04 19:13 UTC (permalink / raw) To: Michal Hocko Cc: Kees Cook, Andrew Morton, Pekka Enberg, David Rientjes, Joonsoo Kim, linux-mm, linux-kernel On Tue, 4 Apr 2017, Michal Hocko wrote: > Yes, but we do not have to blow the kernel, right? Why cannot we simply > leak that memory? Because it is a serious bug to attempt to free a non slab object using slab operations. This is often the result of memory corruption, coding errs etc. The system needs to stop right there. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] mm: Add additional consistency check 2017-04-04 19:13 ` Christoph Lameter @ 2017-04-04 19:42 ` Michal Hocko 2017-04-04 19:58 ` Christoph Lameter 0 siblings, 1 reply; 41+ messages in thread From: Michal Hocko @ 2017-04-04 19:42 UTC (permalink / raw) To: Christoph Lameter Cc: Kees Cook, Andrew Morton, Pekka Enberg, David Rientjes, Joonsoo Kim, linux-mm, linux-kernel On Tue 04-04-17 14:13:06, Cristopher Lameter wrote: > On Tue, 4 Apr 2017, Michal Hocko wrote: > > > Yes, but we do not have to blow the kernel, right? Why cannot we simply > > leak that memory? > > Because it is a serious bug to attempt to free a non slab object using > slab operations. This is often the result of memory corruption, coding > errs etc. The system needs to stop right there. Why when an alternative is a memory leak? -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] mm: Add additional consistency check 2017-04-04 19:42 ` Michal Hocko @ 2017-04-04 19:58 ` Christoph Lameter 2017-04-04 20:13 ` Michal Hocko 0 siblings, 1 reply; 41+ messages in thread From: Christoph Lameter @ 2017-04-04 19:58 UTC (permalink / raw) To: Michal Hocko Cc: Kees Cook, Andrew Morton, Pekka Enberg, David Rientjes, Joonsoo Kim, linux-mm, linux-kernel On Tue, 4 Apr 2017, Michal Hocko wrote: > On Tue 04-04-17 14:13:06, Cristopher Lameter wrote: > > On Tue, 4 Apr 2017, Michal Hocko wrote: > > > > > Yes, but we do not have to blow the kernel, right? Why cannot we simply > > > leak that memory? > > > > Because it is a serious bug to attempt to free a non slab object using > > slab operations. This is often the result of memory corruption, coding > > errs etc. The system needs to stop right there. > > Why when an alternative is a memory leak? Because the slab allocators fail also in case you free an object multiple times etc etc. Continuation is supported by enabling a special resiliency feature via the kernel command line. The alternative is selectable but not the default. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] mm: Add additional consistency check 2017-04-04 19:58 ` Christoph Lameter @ 2017-04-04 20:13 ` Michal Hocko 2017-04-11 4:58 ` Kees Cook 0 siblings, 1 reply; 41+ messages in thread From: Michal Hocko @ 2017-04-04 20:13 UTC (permalink / raw) To: Christoph Lameter Cc: Kees Cook, Andrew Morton, Pekka Enberg, David Rientjes, Joonsoo Kim, linux-mm, linux-kernel On Tue 04-04-17 14:58:06, Cristopher Lameter wrote: > On Tue, 4 Apr 2017, Michal Hocko wrote: > > > On Tue 04-04-17 14:13:06, Cristopher Lameter wrote: > > > On Tue, 4 Apr 2017, Michal Hocko wrote: > > > > > > > Yes, but we do not have to blow the kernel, right? Why cannot we simply > > > > leak that memory? > > > > > > Because it is a serious bug to attempt to free a non slab object using > > > slab operations. This is often the result of memory corruption, coding > > > errs etc. The system needs to stop right there. > > > > Why when an alternative is a memory leak? > > Because the slab allocators fail also in case you free an object multiple > times etc etc. Continuation is supported by enabling a special resiliency > feature via the kernel command line. The alternative is selectable but not > the default. I disagree! We should try to continue as long as we _know_ that the internal state of the allocator is still consistent and a further operation will not spread the corruption even more. This is clearly not the case for an invalid pointer to kfree. I can see why checking for an early allocator corruption is not always feasible and you can only detect after-the-fact but this is not the case here and putting your system down just because some buggy code is trying to free something it hasn't allocated is not really useful. I completely agree with Linus that we overuse BUG way too much and this is just another example of it. -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] mm: Add additional consistency check 2017-04-04 20:13 ` Michal Hocko @ 2017-04-11 4:58 ` Kees Cook 2017-04-11 13:46 ` Michal Hocko 0 siblings, 1 reply; 41+ messages in thread From: Kees Cook @ 2017-04-11 4:58 UTC (permalink / raw) To: Michal Hocko Cc: Christoph Lameter, Andrew Morton, Pekka Enberg, David Rientjes, Joonsoo Kim, Linux-MM, LKML On Tue, Apr 4, 2017 at 1:13 PM, Michal Hocko <mhocko@kernel.org> wrote: > On Tue 04-04-17 14:58:06, Cristopher Lameter wrote: >> On Tue, 4 Apr 2017, Michal Hocko wrote: >> >> > On Tue 04-04-17 14:13:06, Cristopher Lameter wrote: >> > > On Tue, 4 Apr 2017, Michal Hocko wrote: >> > > >> > > > Yes, but we do not have to blow the kernel, right? Why cannot we simply >> > > > leak that memory? >> > > >> > > Because it is a serious bug to attempt to free a non slab object using >> > > slab operations. This is often the result of memory corruption, coding >> > > errs etc. The system needs to stop right there. >> > >> > Why when an alternative is a memory leak? >> >> Because the slab allocators fail also in case you free an object multiple >> times etc etc. Continuation is supported by enabling a special resiliency >> feature via the kernel command line. The alternative is selectable but not >> the default. > > I disagree! We should try to continue as long as we _know_ that the > internal state of the allocator is still consistent and a further > operation will not spread the corruption even more. This is clearly not > the case for an invalid pointer to kfree. > > I can see why checking for an early allocator corruption is not always > feasible and you can only detect after-the-fact but this is not the case > here and putting your system down just because some buggy code is trying > to free something it hasn't allocated is not really useful. I completely > agree with Linus that we overuse BUG way too much and this is just > another example of it. Instead of the proposed BUG here, what's the correct "safe" return value? -Kees -- Kees Cook Pixel Security ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] mm: Add additional consistency check 2017-04-11 4:58 ` Kees Cook @ 2017-04-11 13:46 ` Michal Hocko 2017-04-11 14:14 ` Kees Cook 0 siblings, 1 reply; 41+ messages in thread From: Michal Hocko @ 2017-04-11 13:46 UTC (permalink / raw) To: Kees Cook Cc: Christoph Lameter, Andrew Morton, Pekka Enberg, David Rientjes, Joonsoo Kim, Linux-MM, LKML On Mon 10-04-17 21:58:22, Kees Cook wrote: > On Tue, Apr 4, 2017 at 1:13 PM, Michal Hocko <mhocko@kernel.org> wrote: > > On Tue 04-04-17 14:58:06, Cristopher Lameter wrote: > >> On Tue, 4 Apr 2017, Michal Hocko wrote: > >> > >> > On Tue 04-04-17 14:13:06, Cristopher Lameter wrote: > >> > > On Tue, 4 Apr 2017, Michal Hocko wrote: > >> > > > >> > > > Yes, but we do not have to blow the kernel, right? Why cannot we simply > >> > > > leak that memory? > >> > > > >> > > Because it is a serious bug to attempt to free a non slab object using > >> > > slab operations. This is often the result of memory corruption, coding > >> > > errs etc. The system needs to stop right there. > >> > > >> > Why when an alternative is a memory leak? > >> > >> Because the slab allocators fail also in case you free an object multiple > >> times etc etc. Continuation is supported by enabling a special resiliency > >> feature via the kernel command line. The alternative is selectable but not > >> the default. > > > > I disagree! We should try to continue as long as we _know_ that the > > internal state of the allocator is still consistent and a further > > operation will not spread the corruption even more. This is clearly not > > the case for an invalid pointer to kfree. > > > > I can see why checking for an early allocator corruption is not always > > feasible and you can only detect after-the-fact but this is not the case > > here and putting your system down just because some buggy code is trying > > to free something it hasn't allocated is not really useful. I completely > > agree with Linus that we overuse BUG way too much and this is just > > another example of it. > > Instead of the proposed BUG here, what's the correct "safe" return value? I would assume that _you_ as the one who proposes the change would take some time to read and understand the code and know this answer. This is how we do changes to the kernel: have an objective, understand the code and generate the patch. I am really sad that this particular patch has shown that you didn't bother to consider the later part and blindly applied something that you haven't thought through properly. Please try harder next time. -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] mm: Add additional consistency check 2017-04-11 13:46 ` Michal Hocko @ 2017-04-11 14:14 ` Kees Cook 2017-04-11 14:19 ` Michal Hocko 0 siblings, 1 reply; 41+ messages in thread From: Kees Cook @ 2017-04-11 14:14 UTC (permalink / raw) To: Michal Hocko Cc: Christoph Lameter, Andrew Morton, Pekka Enberg, David Rientjes, Joonsoo Kim, Linux-MM, LKML On Tue, Apr 11, 2017 at 6:46 AM, Michal Hocko <mhocko@kernel.org> wrote: > On Mon 10-04-17 21:58:22, Kees Cook wrote: >> On Tue, Apr 4, 2017 at 1:13 PM, Michal Hocko <mhocko@kernel.org> wrote: >> > On Tue 04-04-17 14:58:06, Cristopher Lameter wrote: >> >> On Tue, 4 Apr 2017, Michal Hocko wrote: >> >> >> >> > On Tue 04-04-17 14:13:06, Cristopher Lameter wrote: >> >> > > On Tue, 4 Apr 2017, Michal Hocko wrote: >> >> > > >> >> > > > Yes, but we do not have to blow the kernel, right? Why cannot we simply >> >> > > > leak that memory? >> >> > > >> >> > > Because it is a serious bug to attempt to free a non slab object using >> >> > > slab operations. This is often the result of memory corruption, coding >> >> > > errs etc. The system needs to stop right there. >> >> > >> >> > Why when an alternative is a memory leak? >> >> >> >> Because the slab allocators fail also in case you free an object multiple >> >> times etc etc. Continuation is supported by enabling a special resiliency >> >> feature via the kernel command line. The alternative is selectable but not >> >> the default. >> > >> > I disagree! We should try to continue as long as we _know_ that the >> > internal state of the allocator is still consistent and a further >> > operation will not spread the corruption even more. This is clearly not >> > the case for an invalid pointer to kfree. >> > >> > I can see why checking for an early allocator corruption is not always >> > feasible and you can only detect after-the-fact but this is not the case >> > here and putting your system down just because some buggy code is trying >> > to free something it hasn't allocated is not really useful. I completely >> > agree with Linus that we overuse BUG way too much and this is just >> > another example of it. >> >> Instead of the proposed BUG here, what's the correct "safe" return value? > > I would assume that _you_ as the one who proposes the change would take > some time to read and understand the code and know this answer. This is > how we do changes to the kernel: have an objective, understand the code > and generate the patch. > > I am really sad that this particular patch has shown that you didn't > bother to consider the later part and blindly applied something that you > haven't thought through properly. Please try harder next time. Our objectives are different: I want the kernel to immediately stop when corruption is detected. Since others are interested in making it survivable, I was hoping to get a hint about what such an improvement would look like. Instead this condescending attitude, can you instead provide constructive help that will get our users closer to the safe kernel operation we're all interested in? -Kees -- Kees Cook Pixel Security ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] mm: Add additional consistency check 2017-04-11 14:14 ` Kees Cook @ 2017-04-11 14:19 ` Michal Hocko 2017-04-11 16:05 ` Kees Cook ` (2 more replies) 0 siblings, 3 replies; 41+ messages in thread From: Michal Hocko @ 2017-04-11 14:19 UTC (permalink / raw) To: Kees Cook Cc: Christoph Lameter, Andrew Morton, Pekka Enberg, David Rientjes, Joonsoo Kim, Linux-MM, LKML On Tue 11-04-17 07:14:01, Kees Cook wrote: > On Tue, Apr 11, 2017 at 6:46 AM, Michal Hocko <mhocko@kernel.org> wrote: > > On Mon 10-04-17 21:58:22, Kees Cook wrote: > >> On Tue, Apr 4, 2017 at 1:13 PM, Michal Hocko <mhocko@kernel.org> wrote: > >> > On Tue 04-04-17 14:58:06, Cristopher Lameter wrote: > >> >> On Tue, 4 Apr 2017, Michal Hocko wrote: > >> >> > >> >> > On Tue 04-04-17 14:13:06, Cristopher Lameter wrote: > >> >> > > On Tue, 4 Apr 2017, Michal Hocko wrote: > >> >> > > > >> >> > > > Yes, but we do not have to blow the kernel, right? Why cannot we simply > >> >> > > > leak that memory? > >> >> > > > >> >> > > Because it is a serious bug to attempt to free a non slab object using > >> >> > > slab operations. This is often the result of memory corruption, coding > >> >> > > errs etc. The system needs to stop right there. > >> >> > > >> >> > Why when an alternative is a memory leak? > >> >> > >> >> Because the slab allocators fail also in case you free an object multiple > >> >> times etc etc. Continuation is supported by enabling a special resiliency > >> >> feature via the kernel command line. The alternative is selectable but not > >> >> the default. > >> > > >> > I disagree! We should try to continue as long as we _know_ that the > >> > internal state of the allocator is still consistent and a further > >> > operation will not spread the corruption even more. This is clearly not > >> > the case for an invalid pointer to kfree. > >> > > >> > I can see why checking for an early allocator corruption is not always > >> > feasible and you can only detect after-the-fact but this is not the case > >> > here and putting your system down just because some buggy code is trying > >> > to free something it hasn't allocated is not really useful. I completely > >> > agree with Linus that we overuse BUG way too much and this is just > >> > another example of it. > >> > >> Instead of the proposed BUG here, what's the correct "safe" return value? > > > > I would assume that _you_ as the one who proposes the change would take > > some time to read and understand the code and know this answer. This is > > how we do changes to the kernel: have an objective, understand the code > > and generate the patch. > > > > I am really sad that this particular patch has shown that you didn't > > bother to consider the later part and blindly applied something that you > > haven't thought through properly. Please try harder next time. > > Our objectives are different: I want the kernel to immediately stop > when corruption is detected. Since others are interested in making it > survivable, I was hoping to get a hint about what such an improvement > would look like. I do not think sprinkling BUG_ONs will help that objective. And BUG_ON under IRQ disable is likely not helping an error survivable... > Instead this condescending attitude, can you instead > provide constructive help that will get our users closer to the safe > kernel operation we're all interested in? I would do something like... --- diff --git a/mm/slab.c b/mm/slab.c index bd63450a9b16..87c99a5e9e18 100644 --- a/mm/slab.c +++ b/mm/slab.c @@ -393,10 +393,15 @@ static inline void set_store_user_dirty(struct kmem_cache *cachep) {} static int slab_max_order = SLAB_MAX_ORDER_LO; static bool slab_max_order_set __initdata; +static inline struct kmem_cache *page_to_cache(struct page *page) +{ + return page->slab_cache; +} + static inline struct kmem_cache *virt_to_cache(const void *obj) { struct page *page = virt_to_head_page(obj); - return page->slab_cache; + return page_to_cache(page); } static inline void *index_to_obj(struct kmem_cache *cache, struct page *page, @@ -3813,14 +3818,18 @@ void kfree(const void *objp) { struct kmem_cache *c; unsigned long flags; + struct page *page; trace_kfree(_RET_IP_, objp); if (unlikely(ZERO_OR_NULL_PTR(objp))) return; + page = virt_to_head_page(obj); + if (CHECK_DATA_CORRUPTION(!PageSlab(page))) + return; local_irq_save(flags); kfree_debugcheck(objp); - c = virt_to_cache(objp); + c = page_to_cache(page); debug_check_no_locks_freed(objp, c->object_size); debug_check_no_obj_freed(objp, c->object_size); -- Michal Hocko SUSE Labs ^ permalink raw reply related [flat|nested] 41+ messages in thread
* Re: [PATCH] mm: Add additional consistency check 2017-04-11 14:19 ` Michal Hocko @ 2017-04-11 16:05 ` Kees Cook 2017-04-11 16:16 ` Christoph Lameter 2017-04-28 1:11 ` Kees Cook 2 siblings, 0 replies; 41+ messages in thread From: Kees Cook @ 2017-04-11 16:05 UTC (permalink / raw) To: Michal Hocko Cc: Christoph Lameter, Andrew Morton, Pekka Enberg, David Rientjes, Joonsoo Kim, Linux-MM, LKML On Tue, Apr 11, 2017 at 7:19 AM, Michal Hocko <mhocko@kernel.org> wrote: > On Tue 11-04-17 07:14:01, Kees Cook wrote: >> On Tue, Apr 11, 2017 at 6:46 AM, Michal Hocko <mhocko@kernel.org> wrote: >> > On Mon 10-04-17 21:58:22, Kees Cook wrote: >> >> On Tue, Apr 4, 2017 at 1:13 PM, Michal Hocko <mhocko@kernel.org> wrote: >> >> > On Tue 04-04-17 14:58:06, Cristopher Lameter wrote: >> >> >> On Tue, 4 Apr 2017, Michal Hocko wrote: >> >> >> >> >> >> > On Tue 04-04-17 14:13:06, Cristopher Lameter wrote: >> >> >> > > On Tue, 4 Apr 2017, Michal Hocko wrote: >> >> >> > > >> >> >> > > > Yes, but we do not have to blow the kernel, right? Why cannot we simply >> >> >> > > > leak that memory? >> >> >> > > >> >> >> > > Because it is a serious bug to attempt to free a non slab object using >> >> >> > > slab operations. This is often the result of memory corruption, coding >> >> >> > > errs etc. The system needs to stop right there. >> >> >> > >> >> >> > Why when an alternative is a memory leak? >> >> >> >> >> >> Because the slab allocators fail also in case you free an object multiple >> >> >> times etc etc. Continuation is supported by enabling a special resiliency >> >> >> feature via the kernel command line. The alternative is selectable but not >> >> >> the default. >> >> > >> >> > I disagree! We should try to continue as long as we _know_ that the >> >> > internal state of the allocator is still consistent and a further >> >> > operation will not spread the corruption even more. This is clearly not >> >> > the case for an invalid pointer to kfree. >> >> > >> >> > I can see why checking for an early allocator corruption is not always >> >> > feasible and you can only detect after-the-fact but this is not the case >> >> > here and putting your system down just because some buggy code is trying >> >> > to free something it hasn't allocated is not really useful. I completely >> >> > agree with Linus that we overuse BUG way too much and this is just >> >> > another example of it. >> >> >> >> Instead of the proposed BUG here, what's the correct "safe" return value? >> > >> > I would assume that _you_ as the one who proposes the change would take >> > some time to read and understand the code and know this answer. This is >> > how we do changes to the kernel: have an objective, understand the code >> > and generate the patch. >> > >> > I am really sad that this particular patch has shown that you didn't >> > bother to consider the later part and blindly applied something that you >> > haven't thought through properly. Please try harder next time. >> >> Our objectives are different: I want the kernel to immediately stop >> when corruption is detected. Since others are interested in making it >> survivable, I was hoping to get a hint about what such an improvement >> would look like. > > I do not think sprinkling BUG_ONs will help that objective. And BUG_ON > under IRQ disable is likely not helping an error survivable... Yes, agreed. Handling it cleanly is always better. >> Instead this condescending attitude, can you instead >> provide constructive help that will get our users closer to the safe >> kernel operation we're all interested in? > > I would do something like... > --- > diff --git a/mm/slab.c b/mm/slab.c > index bd63450a9b16..87c99a5e9e18 100644 > --- a/mm/slab.c > +++ b/mm/slab.c > @@ -393,10 +393,15 @@ static inline void set_store_user_dirty(struct kmem_cache *cachep) {} > static int slab_max_order = SLAB_MAX_ORDER_LO; > static bool slab_max_order_set __initdata; > > +static inline struct kmem_cache *page_to_cache(struct page *page) > +{ > + return page->slab_cache; > +} > + > static inline struct kmem_cache *virt_to_cache(const void *obj) > { > struct page *page = virt_to_head_page(obj); > - return page->slab_cache; > + return page_to_cache(page); > } > > static inline void *index_to_obj(struct kmem_cache *cache, struct page *page, > @@ -3813,14 +3818,18 @@ void kfree(const void *objp) > { > struct kmem_cache *c; > unsigned long flags; > + struct page *page; > > trace_kfree(_RET_IP_, objp); > > if (unlikely(ZERO_OR_NULL_PTR(objp))) > return; > + page = virt_to_head_page(obj); > + if (CHECK_DATA_CORRUPTION(!PageSlab(page))) > + return; > local_irq_save(flags); > kfree_debugcheck(objp); > - c = virt_to_cache(objp); > + c = page_to_cache(page); > debug_check_no_locks_freed(objp, c->object_size); > > debug_check_no_obj_freed(objp, c->object_size); Awesome! Thank you very much! I'll play with this. -Kees -- Kees Cook Pixel Security ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] mm: Add additional consistency check 2017-04-11 14:19 ` Michal Hocko 2017-04-11 16:05 ` Kees Cook @ 2017-04-11 16:16 ` Christoph Lameter 2017-04-11 16:19 ` Kees Cook 2017-04-11 16:41 ` Michal Hocko 2017-04-28 1:11 ` Kees Cook 2 siblings, 2 replies; 41+ messages in thread From: Christoph Lameter @ 2017-04-11 16:16 UTC (permalink / raw) To: Michal Hocko Cc: Kees Cook, Andrew Morton, Pekka Enberg, David Rientjes, Joonsoo Kim, Linux-MM, LKML On Tue, 11 Apr 2017, Michal Hocko wrote: > static inline void *index_to_obj(struct kmem_cache *cache, struct page *page, > @@ -3813,14 +3818,18 @@ void kfree(const void *objp) > { > struct kmem_cache *c; > unsigned long flags; > + struct page *page; > > trace_kfree(_RET_IP_, objp); > > if (unlikely(ZERO_OR_NULL_PTR(objp))) > return; > + page = virt_to_head_page(obj); > + if (CHECK_DATA_CORRUPTION(!PageSlab(page))) There is a flag SLAB_DEBUG_OBJECTS that is available for this check. Consistency checks are configuraable in the slab allocator. Mentioned that before and got this lecture about data consistency checks. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] mm: Add additional consistency check 2017-04-11 16:16 ` Christoph Lameter @ 2017-04-11 16:19 ` Kees Cook 2017-04-11 16:23 ` Christoph Lameter 2017-04-11 16:26 ` Christoph Lameter 2017-04-11 16:41 ` Michal Hocko 1 sibling, 2 replies; 41+ messages in thread From: Kees Cook @ 2017-04-11 16:19 UTC (permalink / raw) To: Christoph Lameter Cc: Michal Hocko, Andrew Morton, Pekka Enberg, David Rientjes, Joonsoo Kim, Linux-MM, LKML On Tue, Apr 11, 2017 at 9:16 AM, Christoph Lameter <cl@linux.com> wrote: > On Tue, 11 Apr 2017, Michal Hocko wrote: > >> static inline void *index_to_obj(struct kmem_cache *cache, struct page *page, >> @@ -3813,14 +3818,18 @@ void kfree(const void *objp) >> { >> struct kmem_cache *c; >> unsigned long flags; >> + struct page *page; >> >> trace_kfree(_RET_IP_, objp); >> >> if (unlikely(ZERO_OR_NULL_PTR(objp))) >> return; >> + page = virt_to_head_page(obj); >> + if (CHECK_DATA_CORRUPTION(!PageSlab(page))) > > There is a flag SLAB_DEBUG_OBJECTS that is available for this check. > Consistency checks are configuraable in the slab allocator. > > Mentioned that before and got this lecture about data consistency checks. It seems that enabling the debug checks comes with a non-trivial performance impact. I'd like to see consistency checks by default so we can handle intentional heap corruption attacks better. This check isn't expensive... -Kees -- Kees Cook Pixel Security ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] mm: Add additional consistency check 2017-04-11 16:19 ` Kees Cook @ 2017-04-11 16:23 ` Christoph Lameter 2017-04-11 16:30 ` Kees Cook 2017-04-11 16:26 ` Christoph Lameter 1 sibling, 1 reply; 41+ messages in thread From: Christoph Lameter @ 2017-04-11 16:23 UTC (permalink / raw) To: Kees Cook Cc: Michal Hocko, Andrew Morton, Pekka Enberg, David Rientjes, Joonsoo Kim, Linux-MM, LKML On Tue, 11 Apr 2017, Kees Cook wrote: > It seems that enabling the debug checks comes with a non-trivial > performance impact. I'd like to see consistency checks by default so > we can handle intentional heap corruption attacks better. This check > isn't expensive... Its in a very hot code and frequently used code path. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] mm: Add additional consistency check 2017-04-11 16:23 ` Christoph Lameter @ 2017-04-11 16:30 ` Kees Cook 0 siblings, 0 replies; 41+ messages in thread From: Kees Cook @ 2017-04-11 16:30 UTC (permalink / raw) To: Christoph Lameter Cc: Michal Hocko, Andrew Morton, Pekka Enberg, David Rientjes, Joonsoo Kim, Linux-MM, LKML On Tue, Apr 11, 2017 at 9:23 AM, Christoph Lameter <cl@linux.com> wrote: > On Tue, 11 Apr 2017, Kees Cook wrote: > >> It seems that enabling the debug checks comes with a non-trivial >> performance impact. I'd like to see consistency checks by default so >> we can handle intentional heap corruption attacks better. This check >> isn't expensive... > > Its in a very hot code and frequently used code path. Yeah, absolutely. All the more reason to make sure the kernel can't be attacked through it. :) As with the automotive industry analogy[1] from Konstantin, we need to make sure Linux not only run fast and efficiently, but also fails gracefully by default. > Note also that these checks can be enabled and disabled at runtime for > each slab cache. Correct, but my understanding is that enabling them through the debug system ends up being much more expensive than this smaller check. The debug code is fairly comprehensive, but it's not been designed for efficient attack detection, etc. -Kees [1] http://kernsec.org/files/lss2015/giant-bags-of-mostly-water.pdf -- Kees Cook Pixel Security ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] mm: Add additional consistency check 2017-04-11 16:19 ` Kees Cook 2017-04-11 16:23 ` Christoph Lameter @ 2017-04-11 16:26 ` Christoph Lameter 1 sibling, 0 replies; 41+ messages in thread From: Christoph Lameter @ 2017-04-11 16:26 UTC (permalink / raw) To: Kees Cook Cc: Michal Hocko, Andrew Morton, Pekka Enberg, David Rientjes, Joonsoo Kim, Linux-MM, LKML On Tue, 11 Apr 2017, Kees Cook wrote: > It seems that enabling the debug checks comes with a non-trivial > performance impact. I'd like to see consistency checks by default so > we can handle intentional heap corruption attacks better. This check > isn't expensive... Note also that these checks can be enabled and disabled at runtime for each slab cache. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] mm: Add additional consistency check 2017-04-11 16:16 ` Christoph Lameter 2017-04-11 16:19 ` Kees Cook @ 2017-04-11 16:41 ` Michal Hocko 2017-04-11 18:03 ` Christoph Lameter 1 sibling, 1 reply; 41+ messages in thread From: Michal Hocko @ 2017-04-11 16:41 UTC (permalink / raw) To: Christoph Lameter Cc: Kees Cook, Andrew Morton, Pekka Enberg, David Rientjes, Joonsoo Kim, Linux-MM, LKML On Tue 11-04-17 11:16:42, Cristopher Lameter wrote: > On Tue, 11 Apr 2017, Michal Hocko wrote: > > > static inline void *index_to_obj(struct kmem_cache *cache, struct page *page, > > @@ -3813,14 +3818,18 @@ void kfree(const void *objp) > > { > > struct kmem_cache *c; > > unsigned long flags; > > + struct page *page; > > > > trace_kfree(_RET_IP_, objp); > > > > if (unlikely(ZERO_OR_NULL_PTR(objp))) > > return; > > + page = virt_to_head_page(obj); > > + if (CHECK_DATA_CORRUPTION(!PageSlab(page))) > > There is a flag SLAB_DEBUG_OBJECTS that is available for this check. Which is way too late, at least for the kfree path. page->slab_cache on anything else than PageSlab is just a garbage. And my understanding of the patch objective is to stop those from happening. > Consistency checks are configuraable in the slab allocator. and they have to be compiled in (at least for SLAB) AFAIR. -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] mm: Add additional consistency check 2017-04-11 16:41 ` Michal Hocko @ 2017-04-11 18:03 ` Christoph Lameter 2017-04-11 18:30 ` Michal Hocko 0 siblings, 1 reply; 41+ messages in thread From: Christoph Lameter @ 2017-04-11 18:03 UTC (permalink / raw) To: Michal Hocko Cc: Kees Cook, Andrew Morton, Pekka Enberg, David Rientjes, Joonsoo Kim, Linux-MM, LKML On Tue, 11 Apr 2017, Michal Hocko wrote: > > > > There is a flag SLAB_DEBUG_OBJECTS that is available for this check. > > Which is way too late, at least for the kfree path. page->slab_cache > on anything else than PageSlab is just a garbage. And my understanding > of the patch objective is to stop those from happening. We are looking here at SLAB. SLUB code can legitimately have a compound page there because large allocations fallback to the page allocator. Garbage would be attempting to free a page that has !PageSLAB set but also is no compound page. That condition is already checked in kfree() with a BUG_ON() and that BUG_ON has been there for a long time. Certainly we can make SLAB consistent if there is no check there already. Slab just attempts a free on that object which will fail too. So we are already handling that condition. Why change things? Add a BUG_ON if you want to make SLAB consistent. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] mm: Add additional consistency check 2017-04-11 18:03 ` Christoph Lameter @ 2017-04-11 18:30 ` Michal Hocko 2017-04-11 18:44 ` Christoph Lameter 0 siblings, 1 reply; 41+ messages in thread From: Michal Hocko @ 2017-04-11 18:30 UTC (permalink / raw) To: Christoph Lameter Cc: Kees Cook, Andrew Morton, Pekka Enberg, David Rientjes, Joonsoo Kim, Linux-MM, LKML On Tue 11-04-17 13:03:01, Cristopher Lameter wrote: > On Tue, 11 Apr 2017, Michal Hocko wrote: > > > > > > > There is a flag SLAB_DEBUG_OBJECTS that is available for this check. > > > > Which is way too late, at least for the kfree path. page->slab_cache > > on anything else than PageSlab is just a garbage. And my understanding > > of the patch objective is to stop those from happening. > > We are looking here at SLAB. SLUB code can legitimately have a compound > page there because large allocations fallback to the page allocator. > > Garbage would be attempting to free a page that has !PageSLAB set but also > is no compound page. That condition is already checked in kfree() with a > BUG_ON() and that BUG_ON has been there for a long time. Are you talking about SLAB or SLUB here? The only BUG_ON(PageSlab(page)) in SLAB I can see is in kmem_freepages and that is way too late because we already rely on cachep which is not trustworthy. Or am I missing some other place you have in mind? > Certainly we can > make SLAB consistent if there is no check there already. Slab just > attempts a free on that object which will fail too. > > So we are already handling that condition. Why change things? Add a BUG_ON > if you want to make SLAB consistent. I hate to repeat myself but let me do it for the last time in this thread. BUG_ON for something that is recoverable is completely inappropriate. And I consider kfree with a bogus pointer something that we can easily recover from. There are other cases where the internal state of the allocator is compromised to the point where continuing is not possible and BUGing there is acceptable but kfree(garbage) is not that case. -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] mm: Add additional consistency check 2017-04-11 18:30 ` Michal Hocko @ 2017-04-11 18:44 ` Christoph Lameter 2017-04-11 18:55 ` Michal Hocko 0 siblings, 1 reply; 41+ messages in thread From: Christoph Lameter @ 2017-04-11 18:44 UTC (permalink / raw) To: Michal Hocko Cc: Kees Cook, Andrew Morton, Pekka Enberg, David Rientjes, Joonsoo Kim, Linux-MM, LKML On Tue, 11 Apr 2017, Michal Hocko wrote: > > So we are already handling that condition. Why change things? Add a BUG_ON > > if you want to make SLAB consistent. > > I hate to repeat myself but let me do it for the last time in this > thread. BUG_ON for something that is recoverable is completely > inappropriate. And I consider kfree with a bogus pointer something that > we can easily recover from. There are other cases where the internal > state of the allocator is compromised to the point where continuing is > not possible and BUGing there is acceptable but kfree(garbage) is not > that case. kfree(garbage) by the core kernel has so far been taken as a sign of severe memory corruption and the kernels have been oopsing when this occurred. This has been that way for a decade or so. kfree() is used by the allocators and various other core kernel components. If the metadata of the core kernel is compromised then it is safest to stop right there. If you want to change things then someone has to do some work. What you are saying is not the way things are implemented. Sorry. Making both allocators consistent is ok with me and is a improvement of the code. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] mm: Add additional consistency check 2017-04-11 18:44 ` Christoph Lameter @ 2017-04-11 18:55 ` Michal Hocko 2017-04-11 18:59 ` Christoph Lameter 0 siblings, 1 reply; 41+ messages in thread From: Michal Hocko @ 2017-04-11 18:55 UTC (permalink / raw) To: Christoph Lameter Cc: Kees Cook, Andrew Morton, Pekka Enberg, David Rientjes, Joonsoo Kim, Linux-MM, LKML On Tue 11-04-17 13:44:02, Cristopher Lameter wrote: > On Tue, 11 Apr 2017, Michal Hocko wrote: > > > > So we are already handling that condition. Why change things? Add a BUG_ON > > > if you want to make SLAB consistent. > > > > I hate to repeat myself but let me do it for the last time in this > > thread. BUG_ON for something that is recoverable is completely > > inappropriate. And I consider kfree with a bogus pointer something that > > we can easily recover from. There are other cases where the internal > > state of the allocator is compromised to the point where continuing is > > not possible and BUGing there is acceptable but kfree(garbage) is not > > that case. > > kfree(garbage) by the core kernel has so far been taken as a sign of > severe memory corruption and the kernels have been oopsing when this > occurred. This has been that way for a decade or so. which doesn't make it a valid decision. We just overuse BUG* > kfree() is used by > the allocators and various other core kernel components. If the metadata > of the core kernel is compromised then it is safest to stop right there. > > If you want to change things then someone has to do some work. What you > are saying is not the way things are implemented. Sorry. I didn't say anything like that. Hence the proposed patch which still needs some more thinking and evaluation. -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] mm: Add additional consistency check 2017-04-11 18:55 ` Michal Hocko @ 2017-04-11 18:59 ` Christoph Lameter 2017-04-11 19:39 ` Michal Hocko 0 siblings, 1 reply; 41+ messages in thread From: Christoph Lameter @ 2017-04-11 18:59 UTC (permalink / raw) To: Michal Hocko Cc: Kees Cook, Andrew Morton, Pekka Enberg, David Rientjes, Joonsoo Kim, Linux-MM, LKML On Tue, 11 Apr 2017, Michal Hocko wrote: > I didn't say anything like that. Hence the proposed patch which still > needs some more thinking and evaluation. This patch does not even affect kfree(). Could you start another discussion thread where you discuss your suggestions for the changes in the allocators and how we could go about this? ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] mm: Add additional consistency check 2017-04-11 18:59 ` Christoph Lameter @ 2017-04-11 19:39 ` Michal Hocko 2017-04-17 15:22 ` Christoph Lameter 0 siblings, 1 reply; 41+ messages in thread From: Michal Hocko @ 2017-04-11 19:39 UTC (permalink / raw) To: Christoph Lameter Cc: Kees Cook, Andrew Morton, Pekka Enberg, David Rientjes, Joonsoo Kim, Linux-MM, LKML On Tue 11-04-17 13:59:44, Cristopher Lameter wrote: > On Tue, 11 Apr 2017, Michal Hocko wrote: > > > I didn't say anything like that. Hence the proposed patch which still > > needs some more thinking and evaluation. > > This patch does not even affect kfree(). Ehm? Are we even talking about the same thing? The whole discussion was to catch invalid pointers to _kfree_ and why BUG* is not the best way to handle that. > Could you start another > discussion thread where you discuss your suggestions for the changes in > the allocators and how we could go about this? I presume Kees will pursue http://lkml.kernel.org/r/20170411141956.GP6729@dhcp22.suse.cz or something along those lines. -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] mm: Add additional consistency check 2017-04-11 19:39 ` Michal Hocko @ 2017-04-17 15:22 ` Christoph Lameter 2017-04-18 6:41 ` Michal Hocko 0 siblings, 1 reply; 41+ messages in thread From: Christoph Lameter @ 2017-04-17 15:22 UTC (permalink / raw) To: Michal Hocko Cc: Kees Cook, Andrew Morton, Pekka Enberg, David Rientjes, Joonsoo Kim, Linux-MM, LKML On Tue, 11 Apr 2017, Michal Hocko wrote: > On Tue 11-04-17 13:59:44, Cristopher Lameter wrote: > > On Tue, 11 Apr 2017, Michal Hocko wrote: > > > > > I didn't say anything like that. Hence the proposed patch which still > > > needs some more thinking and evaluation. > > > > This patch does not even affect kfree(). > > Ehm? Are we even talking about the same thing? The whole discussion was > to catch invalid pointers to _kfree_ and why BUG* is not the best way to > handle that. The patch does not do that. See my review. Invalid points to kfree are already caught with a bug on. See kfree in mm/slub.c ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] mm: Add additional consistency check 2017-04-17 15:22 ` Christoph Lameter @ 2017-04-18 6:41 ` Michal Hocko 2017-04-18 13:31 ` Christoph Lameter 2017-04-18 13:37 ` Christoph Lameter 0 siblings, 2 replies; 41+ messages in thread From: Michal Hocko @ 2017-04-18 6:41 UTC (permalink / raw) To: Christoph Lameter Cc: Kees Cook, Andrew Morton, Pekka Enberg, David Rientjes, Joonsoo Kim, Linux-MM, LKML On Mon 17-04-17 10:22:29, Cristopher Lameter wrote: > On Tue, 11 Apr 2017, Michal Hocko wrote: > > > On Tue 11-04-17 13:59:44, Cristopher Lameter wrote: > > > On Tue, 11 Apr 2017, Michal Hocko wrote: > > > > > > > I didn't say anything like that. Hence the proposed patch which still > > > > needs some more thinking and evaluation. > > > > > > This patch does not even affect kfree(). > > > > Ehm? Are we even talking about the same thing? The whole discussion was > > to catch invalid pointers to _kfree_ and why BUG* is not the best way to > > handle that. > > The patch does not do that. See my review. Invalid points to kfree are > already caught with a bug on. See kfree in mm/slub.c Are you even reading those emails? First of all we are talking about slab here. Secondly I've already pointed out that the BUG_ON(!PageSlab) in kmem_freepages is already too late because we do operate on a potential garbage from invalid page... -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] mm: Add additional consistency check 2017-04-18 6:41 ` Michal Hocko @ 2017-04-18 13:31 ` Christoph Lameter 2017-04-18 13:37 ` Christoph Lameter 1 sibling, 0 replies; 41+ messages in thread From: Christoph Lameter @ 2017-04-18 13:31 UTC (permalink / raw) To: Michal Hocko Cc: Kees Cook, Andrew Morton, Pekka Enberg, David Rientjes, Joonsoo Kim, Linux-MM, LKML On Tue, 18 Apr 2017, Michal Hocko wrote: > > The patch does not do that. See my review. Invalid points to kfree are > > already caught with a bug on. See kfree in mm/slub.c > > Are you even reading those emails? First of all we are talking about > slab here. Secondly I've already pointed out that the BUG_ON(!PageSlab) > in kmem_freepages is already too late because we do operate on a > potential garbage from invalid page... Again this is confusing because you are discussing something that was not covered by the patch submitted. Please start another discussion thread on kfree() separately from the discussion of the patch here. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] mm: Add additional consistency check 2017-04-18 6:41 ` Michal Hocko 2017-04-18 13:31 ` Christoph Lameter @ 2017-04-18 13:37 ` Christoph Lameter 1 sibling, 0 replies; 41+ messages in thread From: Christoph Lameter @ 2017-04-18 13:37 UTC (permalink / raw) To: Michal Hocko Cc: Kees Cook, Andrew Morton, Pekka Enberg, David Rientjes, Joonsoo Kim, Linux-MM, LKML On Tue, 18 Apr 2017, Michal Hocko wrote: > Are you even reading those emails? First of all we are talking about > slab here. Secondly I've already pointed out that the BUG_ON(!PageSlab) > in kmem_freepages is already too late because we do operate on a > potential garbage from invalid page... Before I forget: 1. The patch affects both slab and slub since it patches mm/slab.h and is called by both allocators. 2. The check in the patch we are discussing here when calling kmem_cache_free() will be executing before kmem_freepages() is called in slab. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] mm: Add additional consistency check 2017-04-11 14:19 ` Michal Hocko 2017-04-11 16:05 ` Kees Cook 2017-04-11 16:16 ` Christoph Lameter @ 2017-04-28 1:11 ` Kees Cook 2017-04-28 6:16 ` Michal Hocko 2 siblings, 1 reply; 41+ messages in thread From: Kees Cook @ 2017-04-28 1:11 UTC (permalink / raw) To: Michal Hocko Cc: Christoph Lameter, Andrew Morton, Pekka Enberg, David Rientjes, Joonsoo Kim, Linux-MM, LKML On Tue, Apr 11, 2017 at 7:19 AM, Michal Hocko <mhocko@kernel.org> wrote: > I would do something like... > --- > diff --git a/mm/slab.c b/mm/slab.c > index bd63450a9b16..87c99a5e9e18 100644 > --- a/mm/slab.c > +++ b/mm/slab.c > @@ -393,10 +393,15 @@ static inline void set_store_user_dirty(struct kmem_cache *cachep) {} > static int slab_max_order = SLAB_MAX_ORDER_LO; > static bool slab_max_order_set __initdata; > > +static inline struct kmem_cache *page_to_cache(struct page *page) > +{ > + return page->slab_cache; > +} > + > static inline struct kmem_cache *virt_to_cache(const void *obj) > { > struct page *page = virt_to_head_page(obj); > - return page->slab_cache; > + return page_to_cache(page); > } > > static inline void *index_to_obj(struct kmem_cache *cache, struct page *page, > @@ -3813,14 +3818,18 @@ void kfree(const void *objp) > { > struct kmem_cache *c; > unsigned long flags; > + struct page *page; > > trace_kfree(_RET_IP_, objp); > > if (unlikely(ZERO_OR_NULL_PTR(objp))) > return; > + page = virt_to_head_page(obj); > + if (CHECK_DATA_CORRUPTION(!PageSlab(page))) > + return; > local_irq_save(flags); > kfree_debugcheck(objp); > - c = virt_to_cache(objp); > + c = page_to_cache(page); > debug_check_no_locks_freed(objp, c->object_size); > > debug_check_no_obj_freed(objp, c->object_size); Sorry for the delay, I've finally had time to look at this again. So, this only handles the kfree() case, not the kmem_cache_free() nor kmem_cache_free_bulk() cases, so it misses all the non-kmalloc allocations (and kfree() ultimately calls down to kmem_cache_free()). Similarly, my proposed patch missed the kfree() path. :P As I work on a replacement, is the goal to avoid the checks while under local_irq_save()? (i.e. I can't just put the check in virt_to_cache(), etc.) -Kees -- Kees Cook Pixel Security ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] mm: Add additional consistency check 2017-04-28 1:11 ` Kees Cook @ 2017-04-28 6:16 ` Michal Hocko 0 siblings, 0 replies; 41+ messages in thread From: Michal Hocko @ 2017-04-28 6:16 UTC (permalink / raw) To: Kees Cook Cc: Christoph Lameter, Andrew Morton, Pekka Enberg, David Rientjes, Joonsoo Kim, Linux-MM, LKML On Thu 27-04-17 18:11:28, Kees Cook wrote: > On Tue, Apr 11, 2017 at 7:19 AM, Michal Hocko <mhocko@kernel.org> wrote: > > I would do something like... > > --- > > diff --git a/mm/slab.c b/mm/slab.c > > index bd63450a9b16..87c99a5e9e18 100644 > > --- a/mm/slab.c > > +++ b/mm/slab.c > > @@ -393,10 +393,15 @@ static inline void set_store_user_dirty(struct kmem_cache *cachep) {} > > static int slab_max_order = SLAB_MAX_ORDER_LO; > > static bool slab_max_order_set __initdata; > > > > +static inline struct kmem_cache *page_to_cache(struct page *page) > > +{ > > + return page->slab_cache; > > +} > > + > > static inline struct kmem_cache *virt_to_cache(const void *obj) > > { > > struct page *page = virt_to_head_page(obj); > > - return page->slab_cache; > > + return page_to_cache(page); > > } > > > > static inline void *index_to_obj(struct kmem_cache *cache, struct page *page, > > @@ -3813,14 +3818,18 @@ void kfree(const void *objp) > > { > > struct kmem_cache *c; > > unsigned long flags; > > + struct page *page; > > > > trace_kfree(_RET_IP_, objp); > > > > if (unlikely(ZERO_OR_NULL_PTR(objp))) > > return; > > + page = virt_to_head_page(obj); > > + if (CHECK_DATA_CORRUPTION(!PageSlab(page))) > > + return; > > local_irq_save(flags); > > kfree_debugcheck(objp); > > - c = virt_to_cache(objp); > > + c = page_to_cache(page); > > debug_check_no_locks_freed(objp, c->object_size); > > > > debug_check_no_obj_freed(objp, c->object_size); > > Sorry for the delay, I've finally had time to look at this again. > > So, this only handles the kfree() case, not the kmem_cache_free() nor > kmem_cache_free_bulk() cases, so it misses all the non-kmalloc > allocations (and kfree() ultimately calls down to kmem_cache_free()). > Similarly, my proposed patch missed the kfree() path. :P yes > As I work on a replacement, is the goal to avoid the checks while > under local_irq_save()? (i.e. I can't just put the check in > virt_to_cache(), etc.) You would have to check all callers of virt_to_cache. I would simply replace BUG_ON(!PageSlab()) in cache_from_obj. kmem_cache_free already handles NULL cache. kmem_cache_free_bulk and build_detached_freelist can be made to do so. -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] mm: Add additional consistency check 2017-04-04 11:30 ` Michal Hocko 2017-04-04 15:07 ` Christoph Lameter @ 2017-04-27 12:06 ` Michal Hocko 1 sibling, 0 replies; 41+ messages in thread From: Michal Hocko @ 2017-04-27 12:06 UTC (permalink / raw) To: Kees Cook Cc: Andrew Morton, Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim, linux-mm, linux-kernel On Tue 04-04-17 13:30:22, Michal Hocko wrote: > On Fri 31-03-17 09:40:28, Kees Cook wrote: > > As found in PaX, this adds a cheap check on heap consistency, just to > > notice if things have gotten corrupted in the page lookup. > > > > Signed-off-by: Kees Cook <keescook@chromium.org> > > NAK without a proper changelog. Seriously, we do not blindly apply > changes from other projects without a deep understanding of all > consequences. This still seems to be in the mmotm tree http://www.ozlabs.org/~akpm/mmotm/broken-out/mm-add-additional-consistency-check.patch I hope we have agreed that this is not the right approach to handle invalid pointers in kfree and rather go soemthing like http://lkml.kernel.org/r/20170411141956.GP6729@dhcp22.suse.cz instead. -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] mm: Add additional consistency check 2017-03-31 16:40 [PATCH] mm: Add additional consistency check Kees Cook 2017-03-31 21:33 ` Andrew Morton 2017-04-04 11:30 ` Michal Hocko @ 2017-04-11 18:30 ` Christoph Lameter 2 siblings, 0 replies; 41+ messages in thread From: Christoph Lameter @ 2017-04-11 18:30 UTC (permalink / raw) To: Kees Cook Cc: Andrew Morton, Pekka Enberg, David Rientjes, Joonsoo Kim, linux-mm, linux-kernel On Fri, 31 Mar 2017, Kees Cook wrote: > As found in PaX, this adds a cheap check on heap consistency, just to > notice if things have gotten corrupted in the page lookup. Ok this only affects kmem_cache_free() and not kfree(). For kmem_cache_free() we already have a lot of stuff in the hotpath due to cgruops. If you want this also for kfree() then we need a separate patch. Also for kmem_cache_free(): Here we always have a slab cache and thus we could check the flags that could modify what behavior we want. Acked-by: Christoph Lameter <cl@linux.com> ^ permalink raw reply [flat|nested] 41+ messages in thread
end of thread, other threads:[~2017-04-28 6:16 UTC | newest] Thread overview: 41+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2017-03-31 16:40 [PATCH] mm: Add additional consistency check Kees Cook 2017-03-31 21:33 ` Andrew Morton 2017-04-01 0:04 ` Kees Cook 2017-04-03 3:40 ` Michael Ellerman 2017-04-03 14:03 ` Christoph Lameter 2017-04-03 14:53 ` Matthew Wilcox 2017-04-04 11:30 ` Michal Hocko 2017-04-04 15:07 ` Christoph Lameter 2017-04-04 15:16 ` Michal Hocko 2017-04-04 15:46 ` Kees Cook 2017-04-04 15:58 ` Michal Hocko 2017-04-04 16:02 ` Kees Cook 2017-04-04 19:13 ` Christoph Lameter 2017-04-04 19:42 ` Michal Hocko 2017-04-04 19:58 ` Christoph Lameter 2017-04-04 20:13 ` Michal Hocko 2017-04-11 4:58 ` Kees Cook 2017-04-11 13:46 ` Michal Hocko 2017-04-11 14:14 ` Kees Cook 2017-04-11 14:19 ` Michal Hocko 2017-04-11 16:05 ` Kees Cook 2017-04-11 16:16 ` Christoph Lameter 2017-04-11 16:19 ` Kees Cook 2017-04-11 16:23 ` Christoph Lameter 2017-04-11 16:30 ` Kees Cook 2017-04-11 16:26 ` Christoph Lameter 2017-04-11 16:41 ` Michal Hocko 2017-04-11 18:03 ` Christoph Lameter 2017-04-11 18:30 ` Michal Hocko 2017-04-11 18:44 ` Christoph Lameter 2017-04-11 18:55 ` Michal Hocko 2017-04-11 18:59 ` Christoph Lameter 2017-04-11 19:39 ` Michal Hocko 2017-04-17 15:22 ` Christoph Lameter 2017-04-18 6:41 ` Michal Hocko 2017-04-18 13:31 ` Christoph Lameter 2017-04-18 13:37 ` Christoph Lameter 2017-04-28 1:11 ` Kees Cook 2017-04-28 6:16 ` Michal Hocko 2017-04-27 12:06 ` Michal Hocko 2017-04-11 18:30 ` Christoph Lameter
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).