Message-ID: <4BBCB868.2000705@cs.helsinki.fi>
Date: Wed, 07 Apr 2010 19:52:56 +0300
From: Pekka Enberg
To: Christoph Lameter
CC: "Zhang, Yanmin", Eric Dumazet, netdev, Tejun Heo, alex.shi@intel.com,
 "linux-kernel@vger.kernel.org", "Ma, Ling", "Chen, Tim C", Andrew Morton
Subject: Re: hackbench regression due to commit 9dfc6e68bfe6e
References: <1269506457.4513.141.camel@alexs-hp.sh.intel.com>
 <1269570902.9614.92.camel@alexs-hp.sh.intel.com>
 <1270114166.2078.107.camel@ymzhang.sh.intel.com>
 <1270195589.2078.116.camel@ymzhang.sh.intel.com>
 <4BBA8DF9.8010409@kernel.org>
 <1270542497.2078.123.camel@ymzhang.sh.intel.com>
 <1270591841.2091.170.camel@edumazet-laptop>
 <1270607668.2078.259.camel@ymzhang.sh.intel.com>
 <4BBCB7B7.4040901@cs.helsinki.fi>
In-Reply-To: <4BBCB7B7.4040901@cs.helsinki.fi>

Pekka Enberg wrote:
> Christoph Lameter wrote:
>> I wonder if this is not related to the kmem_cache_cpu structure straggling
>> cache line boundaries under some conditions. On 2.6.33 the kmem_cache_cpu
>> structure was larger and therefore tight packing resulted in different
>> alignment.
>>
>> Could you see how the following patch affects the results. It attempts to
>> increase the size of kmem_cache_cpu to a power of 2 bytes. There is also
>> the potential that other per cpu fetches to neighboring objects affect the
>> situation.
>> We could cacheline align the whole thing.
>>
>> ---
>>  include/linux/slub_def.h |    5 +++++
>>  1 file changed, 5 insertions(+)
>>
>> Index: linux-2.6/include/linux/slub_def.h
>> ===================================================================
>> --- linux-2.6.orig/include/linux/slub_def.h	2010-04-07 11:33:50.000000000 -0500
>> +++ linux-2.6/include/linux/slub_def.h	2010-04-07 11:35:18.000000000 -0500
>> @@ -38,6 +38,11 @@ struct kmem_cache_cpu {
>>  	void **freelist;	/* Pointer to first free per cpu object */
>>  	struct page *page;	/* The slab from which we are allocating */
>>  	int node;		/* The node of the page (or -1 for debug) */
>> +#ifndef CONFIG_64BIT
>> +	int dummy1;
>> +#endif
>> +	unsigned long dummy2;
>> +
>>  #ifdef CONFIG_SLUB_STATS
>>  	unsigned stat[NR_SLUB_STAT_ITEMS];
>>  #endif
>
> Would __cacheline_aligned_in_smp do the trick here?

Oh, sorry, I think it's actually '____cacheline_aligned_in_smp' (with four
underscores) for per-cpu data. Confusing...