From mboxrd@z Thu Jan 1 00:00:00 1970
From: Pekka Enberg
To: Tejun Heo
Cc: "Zhang, Yanmin", Christoph Lameter, "Rafael J. Wysocki",
 Linux Kernel Mailing List, Kernel Testers List, Maciej Rutecki,
 Alex Shi, tim.c.chen@intel.com
Subject: Re: [Bug #15713] hackbench regression due to commit 9dfc6e68bfe6e
Date: Mon, 26 Apr 2010 13:09:21 +0300
In-Reply-To: <4BD564BE.6020700@kernel.org>
References: <4BD086D0.9090309@cs.helsinki.fi>
 <1272265147.2078.648.camel@ymzhang.sh.intel.com>
 <4BD564BE.6020700@kernel.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8BIT
X-Mailing-List: linux-kernel@vger.kernel.org

Hi,

On Mon, Apr 26, 2010 at 9:59 AM, Zhang, Yanmin wrote:
>>>> I haven't been able to reproduce this either on my Core 2 machine.
>>> Mostly, the regression exists on Nehalem machines. I suspect it's
>>> related to hyper-threading machines.

On 04/26/2010 09:22 AM, Pekka Enberg wrote:
>> OK, so does anyone know why hyper-threading would change things for
>> the per-CPU allocator?

On Mon, Apr 26, 2010 at 1:02 PM, Tejun Heo wrote:
> My wild speculation is that previously the cpu_slub structures of two
> neighboring threads ended up on the same cacheline by accident thanks
> to the back to back allocation. W/ the percpu allocator, this no
> longer would happen as the allocator groups percpu data together
> per-cpu.

Yanmin, do we see a lot of remote frees for your hackbench run? IIRC,
it's the "deactivate_remote_frees" stat when CONFIG_SLUB_STATS is
enabled.

                        Pekka
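For context, a minimal userspace sketch of the check behind Tejun's
speculation: two objects sit on the same cache line exactly when their
addresses fall into the same line-sized bucket. The 64-byte line size
and the stand-in struct below are assumptions for illustration, not
the kernel's real kmem_cache_cpu layout.

/* Report whether two addresses fall on the same (assumed 64-byte) cache line. */
#include <stdint.h>
#include <stdio.h>

#define CACHE_LINE_SIZE 64UL	/* assumption; the real value is CPU-specific */

static int same_cache_line(const void *a, const void *b)
{
	return (uintptr_t)a / CACHE_LINE_SIZE == (uintptr_t)b / CACHE_LINE_SIZE;
}

int main(void)
{
	/* Stand-ins for the per-CPU slab structures of two sibling threads. */
	struct { void *freelist; void *page; } cpu0_slab, cpu1_slab;

	printf("share a cache line: %d\n",
	       same_cache_line(&cpu0_slab, &cpu1_slab));
	return 0;
}

Back-to-back allocations, like the two adjacent locals above, will
usually report 1 here; that accidental sharing is what the percpu
allocator removes by placing each CPU's data in its own per-cpu area.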
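And a sketch of pulling out the counter mentioned above: with
CONFIG_SLUB_STATS enabled, each cache exports it via
/sys/kernel/slab/<cache>/deactivate_remote_frees, where the first
number is the total, followed by per-CPU counts. The "kmalloc-256"
default below is only an example cache name.

/* Read the total deactivate_remote_frees count for one SLUB cache. */
#include <stdio.h>

int main(int argc, char **argv)
{
	const char *cache = argc > 1 ? argv[1] : "kmalloc-256";
	char path[256];
	unsigned long total = 0;
	FILE *f;

	snprintf(path, sizeof(path),
		 "/sys/kernel/slab/%s/deactivate_remote_frees", cache);

	f = fopen(path, "r");
	if (!f) {
		perror(path);	/* missing file usually means !CONFIG_SLUB_STATS */
		return 1;
	}
	if (fscanf(f, "%lu", &total) == 1)
		printf("%s: %lu remote-free deactivations\n", cache, total);
	fclose(f);
	return 0;
}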