From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751323Ab0DFI03 (ORCPT ); Tue, 6 Apr 2010 04:26:29 -0400
Received: from mga09.intel.com ([134.134.136.24]:38552 "EHLO mga09.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751048Ab0DFI0X (ORCPT ); Tue, 6 Apr 2010 04:26:23 -0400
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="4.51,371,1267430400"; d="scan'208";a="610551402"
Subject: Re: hackbench regression due to commit 9dfc6e68bfe6e
From: "Zhang, Yanmin"
To: Tejun Heo
Cc: Pekka Enberg, Christoph Lameter, alex.shi@intel.com, "linux-kernel@vger.kernel.org", "Ma, Ling", "Chen, Tim C", Andrew Morton
In-Reply-To: <4BBA8DF9.8010409@kernel.org>
References: <1269506457.4513.141.camel@alexs-hp.sh.intel.com> <1269570902.9614.92.camel@alexs-hp.sh.intel.com> <1270114166.2078.107.camel@ymzhang.sh.intel.com> <1270195589.2078.116.camel@ymzhang.sh.intel.com> <4BBA8DF9.8010409@kernel.org>
Content-Type: text/plain; charset="ISO-8859-1"
Date: Tue, 06 Apr 2010 16:28:17 +0800
Message-Id: <1270542497.2078.123.camel@ymzhang.sh.intel.com>
Mime-Version: 1.0
X-Mailer: Evolution 2.28.0 (2.28.0-2.fc12)
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, 2010-04-06 at 10:27 +0900, Tejun Heo wrote:
> Hello,
>
> On 04/06/2010 02:30 AM, Pekka Enberg wrote:
> >> Hmnmmm... The dynamic percpu areas use page tables and that data is used
> >> in the fast path. Maybe the high thread count causes tlb trashing?
> >
> > Hmm indeed. I don't see anything particularly funny in the SLUB percpu
> > conversion so maybe this is a more issue with the new percpu
> > allocator?
>
> By default, percpu allocator embeds the first chunk in the kernel
> linear mapping and accesses there shouldn't involve any TLB overhead.
> From the second chunk on, they're mapped page-by-page into vmalloc
> area.
> This can be updated to use larger page mapping but 2M page
> per-cpu is pretty large and the trade off hasn't been right yet.
>
> The amount reserved for dynamic allocation in the first chunk is
> determined by PERCPU_DYNAMIC_RESERVE constant in
> include/linux/percpu.h. It's currently 20k on 64bit machines and 12k
> on 32bit. The intention was to size this such that most common stuff
> is allocated from this area. The 20k and 12k are numbers that I
> pulled out of my ass :-) with the custom config I used. Now that more
> stuff has been converted to dynamic percpu, it's quite possible that
> the area is too small. Can you please try to increase the size of the
> area (say 2 or 4 times) and see whether the performance regression
> goes away?
>
> Thanks.

I tried 2 and 4 times the default size and didn't see much improvement.
I checked /proc/vmallocinfo, and it has no pcpu_get_vm_areas entry when
PERCPU_DYNAMIC_RESERVE is set to 4 times the default.

I used perf to collect dTLB misses and LLC misses. The dTLB miss data is
not stable: sometimes a run has more dTLB misses but a better result.
The LLC miss data is more stable. LLC-load-misses is the only clear
signal so far; LLC-store-misses shows no big difference.
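
The /proc/vmallocinfo check described above can be scripted. What follows is a minimal Python sketch, not part of the original test setup. The helper names pcpu_vm_areas and read_vmallocinfo are hypothetical, and the sample lines are made up to illustrate the general layout of the file. It assumes that vmalloc areas created for later percpu chunks list pcpu_get_vm_areas as their caller, as this message implies.

```python
# Sketch only: find vmalloc areas whose caller is pcpu_get_vm_areas().
# An empty result would suggest that no percpu chunk beyond the first
# (linearly mapped) one has been created.


def pcpu_vm_areas(vmallocinfo_text):
    """Return the vmallocinfo lines that mention pcpu_get_vm_areas."""
    return [
        line
        for line in vmallocinfo_text.splitlines()
        if "pcpu_get_vm_areas" in line
    ]


def read_vmallocinfo(path="/proc/vmallocinfo"):
    """Read the live file; this normally requires root."""
    with open(path) as f:
        return f.read()


# Made-up sample lines for illustration, not output from the test machine.
SAMPLE = """\
0xffffc90000000000-0xffffc90000201000 2101248 alloc_large_system_hash+0x14b/0x215 pages=512 vmalloc N0=512
0xffffe8ffffc00000-0xffffe8ffffe00000 2097152 pcpu_get_vm_areas+0x0/0x790 vmalloc
"""

if __name__ == "__main__":
    # Runs on the sample text only, so no special privileges are needed.
    for line in pcpu_vm_areas(SAMPLE):
        print(line)
```

On a live system, pcpu_vm_areas(read_vmallocinfo()) would be expected to return an empty list in the situation reported here, where no pcpu_get_vm_areas entry shows up.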