From: David Rientjes
Date: Wed, 15 Sep 2010 15:53:06 -0700 (PDT)
To: "Ted Ts'o", Pekka Enberg, Linus Torvalds, linux-kernel@vger.kernel.org, Christoph Lameter
Subject: Re: [PATCH v2 2/2] SLUB: Mark merged slab caches in /proc/slabinfo
In-Reply-To: <20100915222509.GE3730@thunk.org>

On Wed, 15 Sep 2010, Ted Ts'o wrote:

> All I can say is I hope the merging code is intelligent.  We recently
> had a problem where we were wasting huge amounts of memory because we
> were allocating large numbers of the ext4_group_info structure,
> which was 132 bytes, and for which kmalloc() used a size-256 slab ---
> and the wasted memory was enough to cause OOM's in a critical
> (unfortunately statically sized) container when the disks got large
> enough and numerous enough.  The fix was to use a separate cache just
> for these 132-byte objects, and not to use kmalloc().
>

That's not cache merging and it wasn't with slub.

kmalloc() allocates from caches that are initialized at boot with the
smallest power-of-two size that allows the object with alignment to fit
(and we have special 96-byte and 192-byte kmalloc caches because they
tend to be popular).  So with slub, a kmalloc(132, ...) would allocate
from kmalloc-192 instead.

Cache merging merges caches created with kmem_cache_create() with
already existing caches, perhaps even those kmalloc caches, that have
the same basic properties.  There are some pretty strict requirements
for whether a cache may be merged: its alignment must be compatible,
and the size must not waste more than 8 bytes on 64-bit.  Caches with
debugging flags and things like SLAB_DESTROY_BY_RCU won't be merged,
either.

> I would be really annoyed if we switched to a slab allocator which did
> merging, and then found that the said slab allocator helpfully merged
> the 132-byte slab cache and the size-256 slab into a single slab
> cache, on the grounds that it thought it would save memory...  (I
> guess I'm just really really nervous about merging happening behind my
> back, and I really like having the per-object type allocation
> statistics.)
>

Slub would allocate kmalloc(132, ...) from kmalloc-192, and it wouldn't
merge your new cache created for ext4_group_info with any other cache
unless it shared the same flags and had a size of 132-140 bytes with a
compatible alignment.  On my system, it looks likely that such a cache
would get merged with the numa_policy cache.
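
To make the two rules above concrete, here is a minimal userspace sketch in C. It is an illustration of the rules as stated in this mail, not the kernel's actual code: the names kmalloc_cache_size(), struct cache_desc and may_merge() are hypothetical, the list of cache sizes (8 through 8192) is an assumption, and "compatible alignment" is modeled simply as the existing cache's size being a multiple of the requested alignment. The real slub check may differ in details, for example by rounding the requested size up before comparing.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed set of kmalloc cache sizes: powers of two plus the special
 * 96- and 192-byte caches. The bounds 8 and 8192 are illustrative. */
static const size_t kmalloc_sizes[] = {
    8, 16, 32, 64, 96, 128, 192, 256, 512, 1024, 2048, 4096, 8192
};

/* Return the size of the smallest cache that fits `size`, or 0 if the
 * request is larger than every cache in the table. */
static size_t kmalloc_cache_size(size_t size)
{
    size_t n = sizeof(kmalloc_sizes) / sizeof(kmalloc_sizes[0]);

    for (size_t i = 0; i < n; i++)
        if (size <= kmalloc_sizes[i])
            return kmalloc_sizes[i];
    return 0;
}

/* Hypothetical, heavily simplified description of a slab cache. */
struct cache_desc {
    size_t size;          /* object size in bytes */
    size_t align;         /* required alignment, a nonzero power of two */
    unsigned long flags;  /* creation flags that must match to merge */
    bool unmergeable;     /* debugging flags, SLAB_DESTROY_BY_RCU, ... */
};

/* Maximum wasted bytes per object, as stated above for 64-bit. */
#define MAX_MERGE_WASTE 8

/* Decide whether a wanted cache could share an existing one under the
 * rules described in the mail. */
static bool may_merge(const struct cache_desc *wanted,
                      const struct cache_desc *existing)
{
    if (wanted->unmergeable || existing->unmergeable)
        return false;
    if (wanted->flags != existing->flags)
        return false;
    if (existing->size < wanted->size)
        return false;                      /* object would not fit */
    if (wanted->align == 0 || existing->size % wanted->align != 0)
        return false;                      /* alignment not compatible */
    return existing->size - wanted->size <= MAX_MERGE_WASTE;
}
```

With this sketch, kmalloc_cache_size(132) picks the 192-byte cache, and a 132-byte cache with 4-byte alignment may merge with a 136-byte cache that has the same flags but never with a 256-byte one, since that would waste 124 bytes per object.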