linux-kernel.vger.kernel.org archive mirror
* [RFC] mempool_alloc() pre-allocated object usage
@ 2005-10-03 14:36 Paul Mundt
  2005-10-03 14:49 ` Arjan van de Ven
  2005-10-03 14:59 ` Brian Gerst
  0 siblings, 2 replies; 7+ messages in thread
From: Paul Mundt @ 2005-10-03 14:36 UTC (permalink / raw)
  To: mingo; +Cc: linux-kernel

Currently mempool_create() will pre-allocate min_nr objects in the pool
for later usage. However, the current semantics of mempool_alloc() are to
first attempt the ->alloc() path and then fall back to using a
pre-allocated object that already exists in the pool.

This is somewhat of a problem if we want to build up a pool of relatively
high order allocations (backed by a slab cache, for example) for
guaranteeing contiguity early on, as sometimes we are able to satisfy the
->alloc() path and end up growing the pool larger than we would like.
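
For illustration, a setup along those lines might look roughly like this
(purely a sketch: the cache name, object size, and pool size are made up,
while mempool_alloc_slab()/mempool_free_slab() are the stock slab-backed
helpers):

#include <linux/init.h>
#include <linux/slab.h>
#include <linux/mempool.h>

/* Illustrative only: pre-allocate NR_RESERVED large, physically
 * contiguous objects from a dedicated slab cache at init time,
 * while contiguous memory is still easy to come by. */
#define OBJ_SIZE	(32 * 1024)
#define NR_RESERVED	8

static kmem_cache_t *obj_cache;
static mempool_t *obj_pool;

static int __init obj_pool_init(void)
{
	obj_cache = kmem_cache_create("obj_cache", OBJ_SIZE, 0,
				      SLAB_HWCACHE_ALIGN, NULL, NULL);
	if (!obj_cache)
		return -ENOMEM;

	/* mempool_create() pre-allocates NR_RESERVED objects up front. */
	obj_pool = mempool_create(NR_RESERVED, mempool_alloc_slab,
				  mempool_free_slab, obj_cache);
	if (!obj_pool) {
		kmem_cache_destroy(obj_cache);
		return -ENOMEM;
	}

	return 0;
}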

The easy way around this would be to first fetch objects out of the pool
and then try ->alloc() in the case where we have no free objects left in
the pool, i.e.:

diff --git a/mm/mempool.c b/mm/mempool.c
--- a/mm/mempool.c
+++ b/mm/mempool.c
@@ -216,11 +216,6 @@ void * mempool_alloc(mempool_t *pool, un
 	gfp_temp = gfp_mask & ~(__GFP_WAIT|__GFP_IO);
 
 repeat_alloc:
-
-	element = pool->alloc(gfp_temp, pool->pool_data);
-	if (likely(element != NULL))
-		return element;
-
 	spin_lock_irqsave(&pool->lock, flags);
 	if (likely(pool->curr_nr)) {
 		element = remove_element(pool);
@@ -229,6 +224,10 @@ repeat_alloc:
 	}
 	spin_unlock_irqrestore(&pool->lock, flags);
 
+	element = pool->alloc(gfp_temp, pool->pool_data);
+	if (likely(element != NULL))
+		return element;
+
 	/* We must not sleep in the GFP_ATOMIC case */
 	if (!(gfp_mask & __GFP_WAIT))
 		return NULL;

The downside to this is that some people may be expecting the
pre-allocated elements to be used as reserve space for when regular
allocations aren't possible, in which case this would break that
behaviour.

Both usage patterns seem valid from my point of view; would you be open
to something that would accommodate both? (i.e., possibly adding a flag
to determine pre-allocated object usage?) Or should I not be using
mempool for contiguity purposes?

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC] mempool_alloc() pre-allocated object usage
  2005-10-03 14:36 [RFC] mempool_alloc() pre-allocated object usage Paul Mundt
@ 2005-10-03 14:49 ` Arjan van de Ven
  2005-10-03 16:21   ` Paul Mundt
  2005-10-04  6:59   ` Jens Axboe
  2005-10-03 14:59 ` Brian Gerst
  1 sibling, 2 replies; 7+ messages in thread
From: Arjan van de Ven @ 2005-10-03 14:49 UTC (permalink / raw)
  To: Paul Mundt; +Cc: mingo, linux-kernel

On Mon, 2005-10-03 at 17:36 +0300, Paul Mundt wrote:

> Both usage patterns seem valid from my point of view; would you be open
> to something that would accommodate both? (i.e., possibly adding a flag
> to determine pre-allocated object usage?) Or should I not be using
> mempool for contiguity purposes?

A similar dilemma came up in the highmem bounce code in 2.4; what worked
really well back then was to do both: use half the pool for
"immediate" use, then try a VM alloc, and use the second half of the
pool for the real emergency cases.

Technically a mempool is there ONLY for the fallback, but I can see some
value in also making it a fastpath by means of a small scratch pool.
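
A rough, untested sketch of that half-and-half idea (assuming it lived in
mm/mempool.c, next to mempool_alloc(), so it could reach remove_element()
and the pool internals; the function name is invented):

void *mempool_alloc_split(mempool_t *pool, unsigned int __nocast gfp_mask)
{
	unsigned long flags;
	void *element;

	/* The upper half of the reserve doubles as a fast scratch pool. */
	spin_lock_irqsave(&pool->lock, flags);
	if (pool->curr_nr > pool->min_nr / 2) {
		element = remove_element(pool);
		spin_unlock_irqrestore(&pool->lock, flags);
		return element;
	}
	spin_unlock_irqrestore(&pool->lock, flags);

	/* Then try the VM, without sleeping or doing I/O. */
	element = pool->alloc(gfp_mask & ~(__GFP_WAIT|__GFP_IO),
			      pool->pool_data);
	if (element)
		return element;

	/* The lower half stays behind as the real emergency reserve,
	 * serviced by the usual mempool_alloc() slowpath. */
	return mempool_alloc(pool, gfp_mask);
}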


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC] mempool_alloc() pre-allocated object usage
  2005-10-03 14:36 [RFC] mempool_alloc() pre-allocated object usage Paul Mundt
  2005-10-03 14:49 ` Arjan van de Ven
@ 2005-10-03 14:59 ` Brian Gerst
  2005-10-04  1:06   ` Nick Piggin
  1 sibling, 1 reply; 7+ messages in thread
From: Brian Gerst @ 2005-10-03 14:59 UTC (permalink / raw)
  To: Paul Mundt; +Cc: mingo, linux-kernel

Paul Mundt wrote:
> The downside to this is that some people may be expecting the
> pre-allocated elements to be used as reserve space for when regular
> allocations aren't possible, in which case this would break that
> behaviour.

This is the original intent of the mempool.  There must be objects in
reserve so that the machine doesn't deadlock on critical allocations
(i.e. disk writes) under memory pressure.

--
				Brian Gerst

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC] mempool_alloc() pre-allocated object usage
  2005-10-03 14:49 ` Arjan van de Ven
@ 2005-10-03 16:21   ` Paul Mundt
  2005-10-05 21:35     ` Marcelo Tosatti
  2005-10-04  6:59   ` Jens Axboe
  1 sibling, 1 reply; 7+ messages in thread
From: Paul Mundt @ 2005-10-03 16:21 UTC (permalink / raw)
  To: Arjan van de Ven; +Cc: mingo, linux-kernel

On Mon, Oct 03, 2005 at 04:49:13PM +0200, Arjan van de Ven wrote:
> On Mon, 2005-10-03 at 17:36 +0300, Paul Mundt wrote:
> > Both usage patterns seem valid from my point of view; would you be open
> > to something that would accommodate both? (i.e., possibly adding a flag
> > to determine pre-allocated object usage?) Or should I not be using
> > mempool for contiguity purposes?
> 
> A similar dilemma came up in the highmem bounce code in 2.4; what worked
> really well back then was to do both: use half the pool for
> "immediate" use, then try a VM alloc, and use the second half of the
> pool for the real emergency cases.
> 
Unfortunately this won't work very well in our case since it's
specifically high order allocations that we are after, and we don't have
the extra RAM to allow for this.

> Technically a mempool is there ONLY for the fallback, but I can see some
> value in also making it a fastpath by means of a small scratch pool.

I haven't been able to think of any really good way to implement this, so
here's my current half-assed solution...

This adds a mempool_alloc_from_pool() that allocates directly from the
pool first if there are elements available, and otherwise falls back to
the normal mempool_alloc() behaviour (and no, I haven't commented it yet,
since that would be futile if no one likes this approach). It's at least
fairly minimalistic, and saves us from doing stupid things with the
gfp_mask in mempool_alloc().
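
For illustration, a hypothetical caller that wants pool-first behaviour
would then just do something like (driver-side sketch, names invented):

/* Hypothetical user: hand out one of the pre-allocated contiguous
 * objects while any remain, and only then fall back to the normal
 * mempool_alloc() path. */
static void *grab_contig_object(mempool_t *pool)
{
	return mempool_alloc_from_pool(pool, GFP_KERNEL);
}

static void release_contig_object(mempool_t *pool, void *obj)
{
	/* Freed objects refill the pool up to min_nr as usual. */
	mempool_free(obj, pool);
}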

--

 include/linux/mempool.h |    2 ++
 mm/mempool.c            |   16 ++++++++++++++++
 2 files changed, 18 insertions(+)

diff --git a/include/linux/mempool.h b/include/linux/mempool.h
--- a/include/linux/mempool.h
+++ b/include/linux/mempool.h
@@ -30,6 +30,8 @@ extern int mempool_resize(mempool_t *poo
 			unsigned int __nocast gfp_mask);
 extern void mempool_destroy(mempool_t *pool);
 extern void * mempool_alloc(mempool_t *pool, unsigned int __nocast gfp_mask);
+extern void * mempool_alloc_from_pool(mempool_t *pool,
+			unsigned int __nocast gfp_mask);
 extern void mempool_free(void *element, mempool_t *pool);
 
 /*
diff --git a/mm/mempool.c b/mm/mempool.c
--- a/mm/mempool.c
+++ b/mm/mempool.c
@@ -246,6 +246,22 @@ repeat_alloc:
 }
 EXPORT_SYMBOL(mempool_alloc);
 
+void *mempool_alloc_from_pool(mempool_t *pool, unsigned int __nocast gfp_mask)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&pool->lock, flags);
+	if (likely(pool->curr_nr)) {
+		void *element = remove_element(pool);
+		spin_unlock_irqrestore(&pool->lock, flags);
+		return element;
+	}
+	spin_unlock_irqrestore(&pool->lock, flags);
+
+	return mempool_alloc(pool, gfp_mask);
+}
+EXPORT_SYMBOL(mempool_alloc_from_pool);
+
 /**
  * mempool_free - return an element to the pool.
  * @element:   pool element pointer.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC] mempool_alloc() pre-allocated object usage
  2005-10-03 14:59 ` Brian Gerst
@ 2005-10-04  1:06   ` Nick Piggin
  0 siblings, 0 replies; 7+ messages in thread
From: Nick Piggin @ 2005-10-04  1:06 UTC (permalink / raw)
  To: Brian Gerst; +Cc: Paul Mundt, Ingo Molnar, lkml

On Mon, 2005-10-03 at 10:59 -0400, Brian Gerst wrote:
> Paul Mundt wrote:
> > The downside to this is that some people may be expecting the
> > pre-allocated elements to be used as reserve space for when regular
> > allocations aren't possible, in which case this would break that
> > behaviour.
> 
> This is the original intent of the mempool.  There must be objects in
> reserve so that the machine doesn't deadlock on critical allocations
> (i.e. disk writes) under memory pressure.
> 

No, the semantics are that at least 'min' objects must be able to
be allocated at one time. The user must be able to proceed far enough
to release its objects in this case, and that ensures no deadlock.

The problem with using the pool first is that it requires the lock
to be taken and is also not NUMA aware. So from a scalability point of
view, I don't think it is a good idea.

Perhaps you could introduce a new mempool allocation interface to do
it your way?

Nick

-- 
SUSE Labs, Novell Inc.




^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC] mempool_alloc() pre-allocated object usage
  2005-10-03 14:49 ` Arjan van de Ven
  2005-10-03 16:21   ` Paul Mundt
@ 2005-10-04  6:59   ` Jens Axboe
  1 sibling, 0 replies; 7+ messages in thread
From: Jens Axboe @ 2005-10-04  6:59 UTC (permalink / raw)
  To: Arjan van de Ven; +Cc: Paul Mundt, mingo, linux-kernel

On Mon, Oct 03 2005, Arjan van de Ven wrote:
> On Mon, 2005-10-03 at 17:36 +0300, Paul Mundt wrote:
> 
> > Both usage patterns seem valid from my point of view; would you be open
> > to something that would accommodate both? (i.e., possibly adding a flag
> > to determine pre-allocated object usage?) Or should I not be using
> > mempool for contiguity purposes?
> 
> A similar dilemma came up in the highmem bounce code in 2.4; what worked
> really well back then was to do both: use half the pool for
> "immediate" use, then try a VM alloc, and use the second half of the
> pool for the real emergency cases.
> 
> Technically a mempool is there ONLY for the fallback, but I can see some
> value in also making it a fastpath by means of a small scratch pool.

The reason it works the way it does is performance: you don't want to
touch the pool lock until you have to. If the page allocations that
happen before falling back to the mempool are the problem, I would
suggest looking at that specific issue first. I think Nick recently did
some changes in that area; there might be more low hanging fruit.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC] mempool_alloc() pre-allocated object usage
  2005-10-03 16:21   ` Paul Mundt
@ 2005-10-05 21:35     ` Marcelo Tosatti
  0 siblings, 0 replies; 7+ messages in thread
From: Marcelo Tosatti @ 2005-10-05 21:35 UTC (permalink / raw)
  To: Paul Mundt; +Cc: Arjan van de Ven, mingo, linux-kernel

Hi Paul,

On Mon, Oct 03, 2005 at 07:21:22PM +0300, Paul Mundt wrote:
> On Mon, Oct 03, 2005 at 04:49:13PM +0200, Arjan van de Ven wrote:
> > On Mon, 2005-10-03 at 17:36 +0300, Paul Mundt wrote:
> > > Both usage patterns seem valid from my point of view; would you be open
> > > to something that would accommodate both? (i.e., possibly adding a flag
> > > to determine pre-allocated object usage?) Or should I not be using
> > > mempool for contiguity purposes?
> > 
> > A similar dilemma came up in the highmem bounce code in 2.4; what worked
> > really well back then was to do both: use half the pool for
> > "immediate" use, then try a VM alloc, and use the second half of the
> > pool for the real emergency cases.
> > 
> Unfortunately this won't work very well in our case since it's
> specifically high order allocations that we are after, and we don't have
> the extra RAM to allow for this.

Out of curiosity, what is the requirement for higher order pages?


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2005-10-06 12:58 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2005-10-03 14:36 [RFC] mempool_alloc() pre-allocated object usage Paul Mundt
2005-10-03 14:49 ` Arjan van de Ven
2005-10-03 16:21   ` Paul Mundt
2005-10-05 21:35     ` Marcelo Tosatti
2005-10-04  6:59   ` Jens Axboe
2005-10-03 14:59 ` Brian Gerst
2005-10-04  1:06   ` Nick Piggin
