From: Pekka J Enberg <penberg@cs.helsinki.fi>
To: Tejun Heo <tj@kernel.org>
Cc: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>,
	Christoph Lameter <cl@linux.com>,
	"Rafael J. Wysocki" <rjw@sisk.pl>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Kernel Testers List <kernel-testers@vger.kernel.org>,
	Maciej Rutecki <maciej.rutecki@gmail.com>,
	Alex Shi <alex.shi@intel.com>,
	tim.c.chen@intel.com, npiggin@suse.de, rientjes@google.com
Subject: Re: [Bug #15713] hackbench regression due to commit 9dfc6e68bfe6e
Date: Mon, 26 Apr 2010 17:17:21 +0300 (EEST)
Message-ID: <alpine.DEB.2.00.1004261710500.16526@melkki.cs.helsinki.fi>
In-Reply-To: <4BD570A8.90304@kernel.org>

On 04/26/2010 12:09 PM, Pekka Enberg wrote:
>>> My wild speculation is that previously the cpu_slub structures of two
>>> neighboring threads ended up on the same cacheline by accident thanks
>>> to the back to back allocation.  W/ the percpu allocator, this no
>>> longer would happen as the allocator groups percpu data together
>>> per-cpu.
>>
>> Yanmin, do we see a lot of remote frees for your hackbench run? IIRC,
>> it's the "deactivate_remote_frees" stat when CONFIG_SLAB_STATS is
>> enabled.

On Mon, 26 Apr 2010, Tejun Heo wrote:
> I'm not familiar with the details or scales here so please take
> whatever I say with a grain of salt.  For hyperthreading configuration
> I think operations don't have to be remote to be affected.  If the
> data for cpu0 and cpu1 were on the same cache line, and cpu0 and cpu1
> are occupying the same physical core thus sharing all the resources it
> would benefit from the sharing whether any operation was remote or not
> as it saves the physical processor one cache line.

Even if the cacheline is dirtied like in the struct kmem_cache_cpu case? 
If that's the case, don't we want the per-CPU allocator to support 
back to back allocation for cores that are in the same package?

Btw, I focused on remote frees initially before I understood what you
actually meant and sketched the following untested patch to take advantage
of the fact that struct kmem_cache_cpu doesn't fill a whole cache line. It
tries to amortize remote free costs by "queuing" objects. It would be
interesting to see if it helps here (or in the other SLUB regressions like
netperf and the famous Intel one).
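
To make the cache line remark concrete, here is a rough userspace sketch of
the two struct layouts. It assumes an LP64 ABI (8-byte pointers, 4-byte int),
a 64-byte cache line, and CONFIG_SLUB_STATS disabled; the struct names and
report_layout() are hypothetical stand-ins, and struct page * is modeled as a
plain void *. Under those assumptions the existing fields take 24 bytes, and
the counter plus five queue slots bring the total to exactly 64.

```c
#include <stddef.h>
#include <stdio.h>

#define CACHE_LINE_BYTES    64	/* assumption: common x86-64 line size */
#define SLUB_MAX_NR_REMOTES 5	/* value used in the patch */

/* Stand-in for struct kmem_cache_cpu before the patch (no stats). */
struct cpu_slab_before {
	void **freelist;
	void *page;		/* struct page * in the kernel */
	int node;
};

/* Stand-in for struct kmem_cache_cpu with the remote queue added. */
struct cpu_slab_after {
	void **freelist;
	void *page;
	int node;
	int nr_remotes;
	void *remotelist[SLUB_MAX_NR_REMOTES];
};

/* Print both sizes next to the assumed cache line size. */
void report_layout(void)
{
	printf("before: %zu bytes\n", sizeof(struct cpu_slab_before));
	printf("after:  %zu bytes\n", sizeof(struct cpu_slab_after));
	printf("line:   %d bytes\n", CACHE_LINE_BYTES);
}
```

Calling report_layout() from a small main() prints the sizes for the build
target; on ABIs with different pointer or int widths the numbers will differ.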

 			Pekka

diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h
index 0249d41..b554a67 100644
--- a/include/linux/slub_def.h
+++ b/include/linux/slub_def.h
@@ -34,10 +34,14 @@ enum stat_item {
  	ORDER_FALLBACK,		/* Number of times fallback was necessary */
  	NR_SLUB_STAT_ITEMS };

+#define SLUB_MAX_NR_REMOTES	5
+
  struct kmem_cache_cpu {
  	void **freelist;	/* Pointer to first free per cpu object */
  	struct page *page;	/* The slab from which we are allocating */
  	int node;		/* The node of the page (or -1 for debug) */
+	int nr_remotes;		/* Number of remotely free'd objects */
+	void *remotelist[SLUB_MAX_NR_REMOTES];	/* List of remotely free'd objects */
  #ifdef CONFIG_SLUB_STATS
  	unsigned stat[NR_SLUB_STAT_ITEMS];
  #endif
diff --git a/mm/slub.c b/mm/slub.c
index 7d6c8b1..e8e5523 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1480,6 +1480,24 @@ static void deactivate_slab(struct kmem_cache *s, struct kmem_cache_cpu *c)
  	unfreeze_slab(s, page, tail);
  }

+static void __slab_free(struct kmem_cache *s, struct page *page, void *x, unsigned long addr);
+
+static void flush_remotelist(struct kmem_cache *s, struct kmem_cache_cpu *c)
+{
+	int i;
+
+	for (i = 0; i < c->nr_remotes; i++) {
+		struct page *page;
+		void *x;
+
+		x = c->remotelist[i];
+		page = virt_to_head_page(x);
+
+		__slab_free(s, page, x, _RET_IP_);
+	}
+	c->nr_remotes = 0;
+}
+
  static inline void flush_slab(struct kmem_cache *s, struct kmem_cache_cpu *c)
  {
  	stat(s, CPUSLAB_FLUSH);
@@ -1496,7 +1514,12 @@ static inline void __flush_cpu_slab(struct kmem_cache *s, int cpu)
  {
  	struct kmem_cache_cpu *c = per_cpu_ptr(s->cpu_slab, cpu);

-	if (likely(c && c->page))
+	if (unlikely(!c))
+		return;
+
+	flush_remotelist(s, c);
+
+	if (likely(c->page))
  		flush_slab(s, c);
  }

@@ -1709,6 +1732,8 @@ static __always_inline void *slab_alloc(struct kmem_cache *s,

  	local_irq_save(flags);
  	c = __this_cpu_ptr(s->cpu_slab);
+	if (unlikely(c->nr_remotes == SLUB_MAX_NR_REMOTES))
+		flush_remotelist(s, c);
  	object = c->freelist;
  	if (unlikely(!object || !node_match(c, node)))

@@ -1865,8 +1890,12 @@ static __always_inline void slab_free(struct kmem_cache *s,
  		set_freepointer(s, object, c->freelist);
  		c->freelist = object;
  		stat(s, FREE_FASTPATH);
-	} else
-		__slab_free(s, page, x, addr);
+	} else {
+		if (unlikely(c->nr_remotes == SLUB_MAX_NR_REMOTES))
+			flush_remotelist(s, c);
+
+		c->remotelist[c->nr_remotes++] = x;
+	}

  	local_irq_restore(flags);
  }
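
For anyone who wants to play with the queuing behaviour outside the kernel,
the following is a minimal userspace model of it, not the kernel code. The
struct remote_queue type, slow_free() and the two counters are hypothetical
names introduced for illustration; slow_free() merely counts calls where the
patch would call __slab_free(). As in the patch, a full queue is drained
before the next object is queued, so up to five objects can stay pending
until a later free, allocation, or explicit flush.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NR_REMOTES 5	/* mirrors SLUB_MAX_NR_REMOTES in the patch */

/* Hypothetical stand-in for the queue fields of struct kmem_cache_cpu. */
struct remote_queue {
	int nr_remotes;
	void *remotelist[MAX_NR_REMOTES];
	unsigned long slow_frees;	/* objects handed to the slow path */
	unsigned long flushes;		/* number of drain passes */
};

/* Stand-in for __slab_free(): only counts the call. */
static void slow_free(struct remote_queue *q, void *obj)
{
	(void)obj;
	q->slow_frees++;
}

/* Drain every queued object through the slow path, then reset the queue. */
static void flush_remotes(struct remote_queue *q)
{
	for (int i = 0; i < q->nr_remotes; i++)
		slow_free(q, q->remotelist[i]);
	q->nr_remotes = 0;
	q->flushes++;
}

/* Queue a remotely freed object, draining first if the queue is full. */
static void queue_remote_free(struct remote_queue *q, void *obj)
{
	if (q->nr_remotes == MAX_NR_REMOTES)
		flush_remotes(q);
	q->remotelist[q->nr_remotes++] = obj;
}
```

Queuing twelve objects into a zero-initialized struct remote_queue triggers
two drain passes of five objects each and leaves two objects pending. The
model ignores locking and interrupts, which the kernel paths handle with
local_irq_save() and the slab lock.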