Date: Fri, 17 Apr 2015 08:06:10 +0200
From: Jesper Dangaard Brouer
Subject: Re: slub: bulk allocation from per cpu partial pages
Message-ID: <20150417080610.4ae80965@redhat.com>
In-Reply-To: <20150417074446.6dd16121@redhat.com>
References: <20150408155304.4480f11f16b60f09879c350d@linux-foundation.org>
	<20150416140638.684838a2@redhat.com>
	<20150417074446.6dd16121@redhat.com>
To: Christoph Lameter
Cc: Andrew Morton, Joonsoo Kim, Pekka Enberg, David Rientjes,
	linux-mm@kvack.org, brouer@redhat.com

On Fri, 17 Apr 2015 07:44:46 +0200 Jesper Dangaard Brouer wrote:

> On Thu, 16 Apr 2015 10:54:07 -0500 (CDT)
> Christoph Lameter wrote:
>
> > On Thu, 16 Apr 2015, Jesper Dangaard Brouer wrote:
> >
> > > On a CPU E5-2630 @ 2.30GHz, the cost of kmem_cache_alloc +
> > > kmem_cache_free, measured in a tight loop (the most optimal
> > > fast-path), is 22ns. This is with elem size 256 bytes, where SLUB
> > > chooses to make 32 obj-per-slab.
> > >
> > > With this patch, testing different bulk sizes, the cost of alloc+free
> > > per element is improved for small bulk sizes (which I guess is the
> > > expected outcome).
> > >
> > > To have something to compare against, I also ran the bulk sizes
> > > through the fallback versions __kmem_cache_alloc_bulk() and
> > > __kmem_cache_free_bulk(), i.e. the non-optimized versions.
> > >
> > > size    -- optimized -- fallback
> > > bulk 8  -- 15ns -- 22ns
> > > bulk 16 -- 15ns -- 22ns
> >
> > Good.
> >
> > > bulk 30 -- 44ns -- 48ns
> > > bulk 32 -- 47ns -- 50ns
> > > bulk 64 -- 52ns -- 54ns
> >
> > Hmm... We are hitting the atomics, I guess. What you have so far only
> > uses the per-cpu data. I wonder how many partial pages are available.
>
> Oops, I can see that this kernel doesn't have CONFIG_SLUB_CPU_PARTIAL;
> I'll re-run the tests with it enabled.

Results with CONFIG_SLUB_CPU_PARTIAL:

size    -- optimized -- fallback
bulk 8  -- 16ns -- 22ns
bulk 16 -- 16ns -- 22ns
bulk 30 -- 16ns -- 22ns
bulk 32 -- 16ns -- 22ns
bulk 64 -- 30ns -- 38ns

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Sr. Network Kernel Developer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer
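
For reference, the fallback versions being compared against are
essentially plain loops over the regular per-object fast-paths, which is
why they pay the full per-object cost for every element regardless of
bulk size. A minimal sketch of what such fallbacks look like (modeled on
the generic helpers that later landed in mm/slab_common.c; the exact
code in this patch series may differ):

void __kmem_cache_free_bulk(struct kmem_cache *s, size_t nr, void **p)
{
	size_t i;

	/* No batching: one full fast-path free per object */
	for (i = 0; i < nr; i++)
		kmem_cache_free(s, p[i]);
}

int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags,
			    size_t nr, void **p)
{
	size_t i;

	/* No batching: one full fast-path alloc per object */
	for (i = 0; i < nr; i++) {
		void *x = p[i] = kmem_cache_alloc(s, flags);

		if (!x) {
			/* Undo the partial allocation on failure */
			__kmem_cache_free_bulk(s, i, p);
			return 0;
		}
	}
	return i;
}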
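
The numbers above are per-element averages from a tight alloc+free
loop. A rough sketch of that kind of micro-benchmark as a one-shot test
module, assuming the bulk API form proposed in this series
(kmem_cache_alloc_bulk()/kmem_cache_free_bulk(), with the alloc side
assumed to return the number of objects allocated); the cache name,
BULK size and loop count are illustrative, not the actual test module:

#include <linux/module.h>
#include <linux/slab.h>
#include <linux/ktime.h>

#define BULK 16

static struct kmem_cache *bench_cache;

static u64 bench_bulk_ns_per_elem(unsigned long loops)
{
	void *objs[BULK];
	ktime_t start, stop;
	unsigned long i;

	start = ktime_get();
	for (i = 0; i < loops; i++) {
		/* Assumed semantics: returns the number of objects
		 * actually allocated (== BULK on success). */
		if (kmem_cache_alloc_bulk(bench_cache, GFP_KERNEL,
					  BULK, objs) != BULK)
			break;
		kmem_cache_free_bulk(bench_cache, BULK, objs);
	}
	stop = ktime_get();

	/* Average cost of one alloc+free pair, in ns per element */
	return ktime_to_ns(ktime_sub(stop, start)) / (loops * BULK);
}

static int __init bench_init(void)
{
	/* 256-byte objects, as in the test above (32 obj-per-slab) */
	bench_cache = kmem_cache_create("bulk_bench", 256, 0, 0, NULL);
	if (!bench_cache)
		return -ENOMEM;

	pr_info("bulk %d: %llu ns per element\n", BULK,
		(unsigned long long)bench_bulk_ns_per_elem(1000000));

	kmem_cache_destroy(bench_cache);
	return 0;
}
module_init(bench_init);

MODULE_LICENSE("GPL");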