From: Matthew Dobson <colpatch@us.ibm.com>
To: Paul Jackson <pj@sgi.com>
Cc: clameter@engr.sgi.com, linux-kernel@vger.kernel.org,
	sri@us.ibm.com, andrea@suse.de, pavel@suse.cz,
	linux-mm@kvack.org
Subject: Re: [patch 3/9] mempool - Make mempools NUMA aware
Date: Fri, 27 Jan 2006 17:00:18 -0800	[thread overview]
Message-ID: <43DAC222.4060805@us.ibm.com> (raw)
In-Reply-To: <20060127025126.c95f8002.pj@sgi.com>

Paul Jackson wrote:
> Matthew wrote:
> 
>>I'm glad we're on the same page now. :)  And yes, adding four "duplicate"
>>*_mempool allocators was not my first choice, but I couldn't easily see a
>>better way.
> 
> 
> I hope the following comments aren't too far off target.
> 
> I too am inclined to prefer the __GFP_CRITICAL approach over this.

OK.  Chalk one more up for that solution...


> That or Andrea's suggestion, which, except for a free hook, was entirely
> outside the page_alloc.c code paths.

This is supposed to be an implementation of Andrea's suggestion.  There are
no hooks in ANY page_alloc.c code paths.  These patches touch mempool code
and some slab code, but not any page allocator code.


> Or Alan's suggested revival
> of the old code to drop non-critical network packets under duress.

Dropping non-critical packets is still in our plan, but I don't think that
is a FULL solution.  As we mentioned before on that topic, you can't tell
whether a packet is critical until AFTER you receive it, by which point an
skbuff has (hopefully) already been allocated for it.  If your network
traffic is coming in faster than you can receive, examine, and drop
non-critical packets, you're hosed.  I still think some sort of reserve pool
is necessary to give the networking stack a little breathing room when it is
under both memory pressure and network load.
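
To make the "breathing room" idea concrete, here is a minimal sketch of
the pattern I mean.  The pool and wrapper names below are hypothetical
(they are not part of this patch series), and the pool is assumed to be
created elsewhere with callbacks that hand out fixed-size skbs:

#include <linux/skbuff.h>
#include <linux/mempool.h>

static mempool_t *rx_reserve_pool;	/* hypothetical reserve of skbs */

static struct sk_buff *rx_alloc_skb(void)
{
	/*
	 * mempool_alloc() tries the pool's underlying allocator first;
	 * only when that fails (e.g. GFP_ATOMIC under memory pressure)
	 * does it hand out one of the reserved elements.  That reserve
	 * lets the packet be received and examined before we decide
	 * whether to drop it.
	 */
	return mempool_alloc(rx_reserve_pool, GFP_ATOMIC);
}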


> I am tempted to think you've taken an approach that raised some
> substantial looking issues:
> 
>  * how to tell the system when to use the emergency pool

We've dropped the whole "in_emergency" thing.  The system uses the
emergency pool when the normal pool (i.e., the buddy allocator) is out of
pages.
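
That ordering is exactly how the existing mempool API behaves:
mempool_alloc() always tries the underlying allocator first and only dips
into the reserved elements when that fails.  A minimal sketch of an
emergency pool of pages (the callback and pool names are illustrative):

#include <linux/mempool.h>
#include <linux/gfp.h>
#include <linux/init.h>
#include <linux/errno.h>

/* Illustrative callbacks for a pool of order-0 pages. */
static void *page_pool_alloc(gfp_t gfp_mask, void *pool_data)
{
	return alloc_pages(gfp_mask, 0);	/* normal buddy-allocator path */
}

static void page_pool_free(void *element, void *pool_data)
{
	__free_pages(element, 0);
}

static mempool_t *emergency_pool;

static int __init emergency_pool_init(void)
{
	/* Reserve 64 pages up front; mempool_alloc() touches them only
	 * when alloc_pages() fails in page_pool_alloc() above. */
	emergency_pool = mempool_create(64, page_pool_alloc, page_pool_free,
					NULL);
	return emergency_pool ? 0 : -ENOMEM;
}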

>  * this doesn't really solve the problem (network can still starve)

Only if the pool is not large enough.  One can argue that sizing the pool
appropriately is impossible (the theoretical incoming traffic over a GigE
card or two, sustained for even a minute or two, is extremely large), but
then I guess we shouldn't even try to fix the problem...?

>  * it wastes memory most of the time

True.  Any "real" reserve system will suffer from that problem.  Ben
LaHaise suggested a reserve system that allows the reserve pages to be used
for trivially reclaimable allocations while not in active use.  An
interesting idea.  Regardless, the Linux VM sorta already wastes memory by
keeping min_free_kbytes worth of pages free, no?

>  * it doesn't really improve on GFP_ATOMIC

I disagree.  It improves on GFP_ATOMIC by giving it a second chance.  If
you've got a GFP_ATOMIC allocation that is particularly critical, using a
mempool to back it means that you can keep going for a while when the rest
of the system OOMs/goes into SWAP hell/etc.
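
Using the emergency pool from the sketch above, a hypothetical critical
call site stays a single allocation but gets that second chance from the
reserve:

/* Hypothetical critical call site backed by the emergency pool. */
static struct page *critical_get_page(void)
{
	/* Tries the buddy allocator first, then the reserve; NULL
	 * means both are exhausted. */
	return mempool_alloc(emergency_pool, GFP_ATOMIC);
}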

> and just added another substantial looking issue:
> 
>  * it entwines another thread of complexity and performance costs
>    into the important memory allocation code path.

I can't deny that it adds some complexity to an important memory
allocation path, but I don't think it is a significant amount of
complexity.  It is just a pointer check in kmem_getpages()...
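
For a sense of scale, the check might look roughly like this; the
cachep->pool field and the surrounding shape of kmem_getpages() are
assumptions based on the description above, not the actual patch:

/* Sketch only: a per-cache mempool pointer check in kmem_getpages().
 * The 'pool' field is an assumed addition to struct kmem_cache. */
static void *kmem_getpages(struct kmem_cache *cachep, gfp_t flags, int nodeid)
{
	struct page *page;

	if (cachep->pool)	/* this cache is backed by a mempool */
		page = mempool_alloc(cachep->pool, flags);
	else
		page = alloc_pages_node(nodeid, flags, cachep->gfporder);

	if (!page)
		return NULL;
	return page_address(page);
}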


>>With large machines, especially as
>>those large machines' workloads are more and more likely to be partitioned
>>with something like cpusets, you want to be able to specify where you want
>>your reserve pool to come from.
> 
> 
> Cpusets is about performance, not correctness.  Anytime I get cornered
> in the cpuset code, I prefer violating the cpuset containment, over
> serious system failure.

Fair enough.  But if we can keep the same baseline performance and add this
new feature, I'd like to do that.  Making a best effort to allocate on a
particular node when requested isn't too much to ask.
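
At the allocator level, "doing our best" is cheap: a node-aware pool
callback only has to route the request through alloc_pages_node().  The
nid-in-pool_data convention below is illustrative:

#include <linux/mempool.h>
#include <linux/gfp.h>

/* Illustrative node-aware callback: pool_data carries the target node. */
static void *node_page_alloc(gfp_t gfp_mask, void *pool_data)
{
	int nid = (long)pool_data;

	/* Best effort on the requested node; under pressure the buddy
	 * allocator may still fall back to other nodes per its policy. */
	return alloc_pages_node(nid, gfp_mask, 0);
}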

-Matt
