From: Christoph Lameter <cl@linux.com>
To: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: linux-mm@kvack.org
Subject: [RFC V2 SLEB 13/14] SLEB: Enhanced NUMA support
Date: Fri, 21 May 2010 16:15:05 -0500
Message-ID: <20100521211544.756019063@quilx.com>
In-Reply-To: 20100521211452.659982351@quilx.com

[-- Attachment #1: sled_numa --]
[-- Type: text/plain, Size: 3885 bytes --]

Before this patch, all queues in SLEB may contain mixed objects (from any node).
This remains the case with this patch unless the slab has SLAB_MEM_SPREAD set.

For SLAB_MEM_SPREAD slabs, an ordering by locality is enforced and objects are
managed per NUMA node (as in SLAB). Cpu queues contain only objects from the
local node. Alien objects (from non-local nodes) are freed into the shared
cache of the remote node. This avoids alien caches but introduces cache-cold
objects into the shared cache.

This also adds object-level NUMA functionality, as in SLAB, that can be
managed via cpusets or memory policies.

Signed-off-by: Christoph Lameter <cl@linux-foundation.org>

---
 mm/slub.c |   70 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 70 insertions(+)
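
The node selection order described above (cpuset spreading first, then the
task memory policy, and only for SLAB_MEM_SPREAD caches outside interrupt
context) can be modeled in userspace roughly as follows. This is only a
sketch: struct alloc_ctx, pick_node() and NODE_UNSPECIFIED are hypothetical
stand-ins for the kernel state and for SLAB_NODE_UNSPECIFIED, and the value
-1 is an assumption.

```c
#include <stdbool.h>

#define NODE_UNSPECIFIED (-1)	/* assumed stand-in for SLAB_NODE_UNSPECIFIED */

/* Hypothetical snapshot of the state the node lookup consults. */
struct alloc_ctx {
	bool mem_spread;	/* cache has SLAB_MEM_SPREAD set */
	bool in_irq;		/* models in_interrupt() */
	bool cpuset_spread;	/* models cpuset_do_slab_mem_spread() */
	int cpuset_node;	/* models cpuset_mem_spread_node() */
	bool has_mempolicy;	/* models current->mempolicy != NULL */
	int policy_node;	/* models slab_node(current->mempolicy) */
};

/*
 * Return the node to allocate from. An explicit node request always wins.
 * Policy is consulted only for spread caches outside interrupt context.
 */
static int pick_node(const struct alloc_ctx *ctx, int selected_node)
{
	if (ctx->mem_spread && !ctx->in_irq &&
			selected_node == NODE_UNSPECIFIED) {
		if (ctx->cpuset_spread)
			return ctx->cpuset_node;
		if (ctx->has_mempolicy)
			return ctx->policy_node;
	}
	return selected_node;
}
```

With cpuset spreading active the cpuset node is returned even if a memory
policy is also set. An explicit node passed by the caller is never overridden.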

Index: linux-2.6/mm/slub.c
===================================================================
--- linux-2.6.orig/mm/slub.c	2010-05-20 16:57:14.000000000 -0500
+++ linux-2.6/mm/slub.c	2010-05-20 16:57:37.000000000 -0500
@@ -1718,6 +1718,24 @@ void retrieve_objects(struct kmem_cache 
 	}
 }
 
+static inline int find_numa_node(struct kmem_cache *s, int selected_node)
+{
+#ifdef CONFIG_NUMA
+	if (s->flags & SLAB_MEM_SPREAD &&
+			!in_interrupt() &&
+			selected_node == SLAB_NODE_UNSPECIFIED) {
+
+		if (cpuset_do_slab_mem_spread())
+			return cpuset_mem_spread_node();
+
+		if (current->mempolicy)
+			return slab_node(current->mempolicy);
+	}
+#endif
+	return selected_node;
+}
+
+
 static void *slab_alloc(struct kmem_cache *s,
 		gfp_t gfpflags, int node, unsigned long addr)
 {
@@ -1732,6 +1750,7 @@ static void *slab_alloc(struct kmem_cach
 		return NULL;
 
 redo:
+	node = find_numa_node(s, node);
 	local_irq_save(flags);
 	c = __this_cpu_ptr(s->cpu_slab);
 	if (unlikely(!c->objects || !node_match(c, node))) {
@@ -1877,6 +1896,54 @@ void *kmem_cache_alloc_node_notrace(stru
 EXPORT_SYMBOL(kmem_cache_alloc_node_notrace);
 #endif
 
+int numa_off_node_free(struct kmem_cache *s, void *x)
+{
+#ifdef CONFIG_NUMA
+	if (s->flags & SLAB_MEM_SPREAD) {
+		int node = page_to_nid(virt_to_page(x));
+		/*
+		 * Slab requires object level control of locality. We can only
+		 * keep objects from the local node in the per cpu queue.
+		 * Foreign objects must not be freed to the queue.
+		 *
+		 * If we encounter a free of an off node object then we free
+		 * it to the shared cache of that node. This places a cache
+		 * cold object into that queue though. But using the queue
+		 * is much more effective than going directly into the slab.
+		 *
+		 * Alternate approach: Call drain_objects directly for a single
+		 * object. (Drain objects would have to be fixed to not save
+		 * to the local shared mem cache by default).
+		 */
+		if (node != numa_node_id()) {
+			struct kmem_cache_node *n = get_node(s, node);
+redo:
+			if (n->objects >= s->shared) {
+				int t = min(s->batch, n->objects);
+
+				drain_objects(s, n->object, t);
+
+				n->objects -= t;
+				if (n->objects)
+					memmove(n->object, n->object + t,
+						n->objects * sizeof(void *));
+			}
+			spin_lock(&n->shared_lock);
+			if (n->objects < s->shared) {
+				n->object[n->objects++] = x;
+				x = NULL;
+			}
+			spin_unlock(&n->shared_lock);
+			if (x)
+				goto redo;
+			return 1;
+		}
+	}
+#endif
+	return 0;
+}
+
+
 static void slab_free(struct kmem_cache *s,
 			void *x, unsigned long addr)
 {
@@ -1895,6 +1962,9 @@ static void slab_free(struct kmem_cache 
 	if (!(s->flags & SLAB_DEBUG_OBJECTS))
 		debug_check_no_obj_freed(object, s->objsize);
 
+	if (numa_off_node_free(s, x))
+		goto out;
+
 	if (unlikely(c->objects >= s->queue)) {
 
 		int t = min(s->batch, c->objects);
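
The off-node free path added by this patch can be modeled in userspace
roughly as follows. This is a simplified sketch and not the kernel code:
struct node_queue, drain(), queue_remote_free() and off_node_free() are
hypothetical names, SHARED_CAP and BATCH stand in for s->shared and
s->batch, draining only counts objects, and locking is left out entirely.
The sketch uses memmove for the shift because the source and destination
ranges can overlap.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SHARED_CAP 4	/* stand-in for s->shared */
#define BATCH      2	/* stand-in for s->batch */

/* Hypothetical per node shared queue. */
struct node_queue {
	void *object[SHARED_CAP];
	size_t objects;		/* number of valid entries in object[] */
	size_t drained;		/* objects handed back to the slab pages */
};

/* Models drain_objects(): here it only counts what was drained. */
static void drain(struct node_queue *n, size_t count)
{
	n->drained += count;
}

/* Queue x on node queue n. Drain the oldest batch first if n is full. */
static void queue_remote_free(struct node_queue *n, void *x)
{
	assert(n->objects <= SHARED_CAP);

	if (n->objects >= SHARED_CAP) {
		size_t t = BATCH < n->objects ? BATCH : n->objects;

		drain(n, t);
		n->objects -= t;
		if (n->objects)
			memmove(n->object, n->object + t,
				n->objects * sizeof(void *));
	}
	n->object[n->objects++] = x;
}

/*
 * Returns 1 if x belonged to another node and was queued there,
 * 0 if the caller should free x to its own cpu queue.
 */
static int off_node_free(struct node_queue *nodes, int local_node,
			 int obj_node, void *x)
{
	if (obj_node == local_node)
		return 0;
	queue_remote_free(&nodes[obj_node], x);
	return 1;
}
```

With a capacity of 4 and a batch of 2, a fifth remote free drains the two
oldest entries, shifts the remaining two to the front and appends the new
object, leaving three entries queued.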

