From mboxrd@z Thu Jan 1 00:00:00 1970
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751441AbeAPRnX (ORCPT + 1 other);
	Tue, 16 Jan 2018 12:43:23 -0500
Received: from bombadil.infradead.org ([65.50.211.133]:35189 "EHLO
	bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751107AbeAPRnT (ORCPT );
	Tue, 16 Jan 2018 12:43:19 -0500
Date: Tue, 16 Jan 2018 09:43:15 -0800
From: Matthew Wilcox
To: Christopher Lameter
Cc: Kees Cook, linux-kernel@vger.kernel.org, David Windsor,
	Pekka Enberg, David Rientjes, Joonsoo Kim, Andrew Morton,
	linux-mm@kvack.org, linux-xfs@vger.kernel.org, Linus Torvalds,
	Alexander Viro, Andy Lutomirski, Christoph Hellwig,
	"David S. Miller", Laura Abbott, Mark Rutland,
	"Martin K. Petersen", Paolo Bonzini, Christian Borntraeger,
	Christoffer Dall, Dave Kleikamp, Jan Kara, Luis de Bethencourt,
	Marc Zyngier, Rik van Riel, Matthew Garrett,
	linux-fsdevel@vger.kernel.org, linux-arch@vger.kernel.org,
	netdev@vger.kernel.org, kernel-hardening@lists.openwall.com
Subject: Re: kmem_cache_attr (was Re: [PATCH 04/36] usercopy: Prepare for
	usercopy whitelisting)
Message-ID: <20180116174315.GA10461@bombadil.infradead.org>
References: <1515531365-37423-1-git-send-email-keescook@chromium.org>
	<1515531365-37423-5-git-send-email-keescook@chromium.org>
	<20180114230719.GB32027@bombadil.infradead.org>
	<20180116160525.GF30073@bombadil.infradead.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.9.1 (2017-09-22)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Return-Path:

On Tue, Jan 16, 2018 at 10:54:27AM -0600, Christopher Lameter wrote:
> On Tue, 16 Jan 2018, Matthew Wilcox wrote:
>
> > I think that's a good thing!  /proc/slabinfo really starts to get grotty
> > above 16 bytes.  I'd like to chop off "_cache" from the name of every
> > single slab!  If ext4_allocation_context has to become ext4_alloc_ctx,
> > I don't think we're going to lose any valuable information.
>
> Ok so we are going to cut off at 16 characters? Sounds good to me.

Excellent!

> > > struct kmem_cache_attr {
> > > 	char *name;
> > > 	size_t size;
> > > 	size_t align;
> > > 	slab_flags_t flags;
> > > 	unsigned int useroffset;
> > > 	unsigned int usersize;
> > > 	void (*ctor)(void *);
> > > 	kmem_isolate_func *isolate;
> > > 	kmem_migrate_func *migrate;
> > > 	...
> > > }
> >
> > In these slightly-more-security-conscious days, it's considered poor
> > practice to have function pointers in writable memory.  That was why
> > I wanted to make the kmem_cache_attr const.
>
> Sure this data is never changed. It can be const.

It's changed at initialisation.  Look:

kmem_cache_create(const char *name, size_t size, size_t align,
                  slab_flags_t flags, void (*ctor)(void *))
        s = create_cache(cache_name, size, size,
                         calculate_alignment(flags, align, size),
                         flags, ctor, NULL, NULL);

The 'align' that ends up in s->align is not the user-specified align.
It's also dependent on runtime information (cache_line_size()), so it
can't be calculated at compile time.

'flags' also gets mangled:
        flags &= CACHE_CREATE_MASK;

> I am not married to either way of specifying the sizes. unsigned int would
> be fine with me. SLUB falls back to the page allocator anyways for
> anything above 2* PAGE_SIZE and I think we can do the same for the other
> allocators as well. Zeroing or initializing such a large memory chunk is
> much more expensive than the allocation so it does not make much sense to
> have that directly supported in the slab allocators.

The only slabs larger than 4kB on my system right now are:
kvm_vcpu               0      0  19136    1    8 : tunables    8    4    0 : slabdata      0      0      0
net_namespace          1      1   6080    1    2 : tunables    8    4    0 : slabdata      1      1      0

(other than the fake slabs for kmalloc)

> Some platforms support 64K page size and I could envision a 2M page size
> at some point. So I think we cannot use 16 bits there.
>
> If no one objects then I can use unsigned int there again.

unsigned int would be my preference.
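
To make the point about initialisation concrete, here is a minimal
userspace sketch of one way the two concerns could be separated: the
caller's attributes stay const (so the ctor pointer can live in read-only
data), and the values derived at creation time go into allocator-owned
state.  Every demo_* name, the flag values and the cache_line parameter
(a stand-in for cache_line_size()) are assumptions made for illustration.
The alignment helper is only loosely modelled on calculate_alignment();
none of this is the actual kernel code or the API proposed in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bits, for illustration only (not the kernel's values). */
#define DEMO_HWCACHE_ALIGN 0x1u
#define DEMO_INTERNAL_FLAG 0x8000u	/* a bit callers are not allowed to set */
#define DEMO_CREATE_MASK   DEMO_HWCACHE_ALIGN
#define DEMO_MIN_ALIGN     ((unsigned int)sizeof(void *))

/* Caller-supplied description; never written to after it is defined. */
struct demo_cache_attr {
	const char *name;
	unsigned int size;
	unsigned int align;	/* what the caller asked for */
	unsigned int flags;
	void (*ctor)(void *);
};

/* Allocator-owned state holding the values derived at creation time. */
struct demo_cache {
	const struct demo_cache_attr *attr;
	unsigned int align;	/* effective alignment */
	unsigned int flags;	/* masked flags */
};

/* Depends on the cache line size, which is only known at run time. */
static unsigned int demo_calculate_alignment(unsigned int flags,
					     unsigned int align,
					     unsigned int size,
					     unsigned int cache_line)
{
	assert(size > 0);
	if (flags & DEMO_HWCACHE_ALIGN) {
		unsigned int ralign = cache_line;

		/* Small objects do not need a whole cache line each. */
		while (size <= ralign / 2)
			ralign /= 2;
		if (ralign > align)
			align = ralign;
	}
	if (align < DEMO_MIN_ALIGN)
		align = DEMO_MIN_ALIGN;
	/* Round up to a multiple of the pointer size. */
	return (align + DEMO_MIN_ALIGN - 1) & ~(DEMO_MIN_ALIGN - 1);
}

static void demo_cache_init(struct demo_cache *s,
			    const struct demo_cache_attr *attr,
			    unsigned int cache_line)
{
	assert(s != NULL && attr != NULL);
	s->attr = attr;
	/* 'flags' gets mangled, but only in the allocator's own copy. */
	s->flags = attr->flags & DEMO_CREATE_MASK;
	s->align = demo_calculate_alignment(s->flags, attr->align,
					    attr->size, cache_line);
}
```

With a 100-byte object and DEMO_HWCACHE_ALIGN set, calling
demo_cache_init() with a cache line of 64 and then of 128 gives different
values of s->align while attr->align is left untouched, which is why the
effective alignment cannot simply be written back into a const attribute
structure at compile time.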