From: Mel Gorman <mgorman@suse.de>
To: Christoph Lameter <cl@linux.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Linux-MM <linux-mm@kvack.org>,
	Linux-Netdev <netdev@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	David Miller <davem@davemloft.net>, Neil Brown <neilb@suse.de>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Pekka Enberg <penberg@cs.helsinki.fi>
Subject: Re: [PATCH 02/15] mm: sl[au]b: Add knowledge of PFMEMALLOC reserve pages
Date: Wed, 8 Feb 2012 21:23:23 +0000	[thread overview]
Message-ID: <20120208212323.GM5938@suse.de> (raw)
In-Reply-To: <alpine.DEB.2.00.1202081338210.32060@router.home>

On Wed, Feb 08, 2012 at 01:49:05PM -0600, Christoph Lameter wrote:
> On Wed, 8 Feb 2012, Mel Gorman wrote:
> 
> > Ok, I looked into what is necessary to replace these with checking a page
> > flag and the cost shifts quite a bit and ends up being more expensive.
> 
> That is only true if you go the slab route.

Well, yes, but both slab and slub have to be supported. I see no reason
to make this a slab-only or slub-only feature. Slob is not supported
because a platform using slob is not expected to also be using
network-based swap.

> Slab suffers from not having
> the page struct pointer readily available. The changes are likely already
> impacting slab performance without the virt_to_page patch.
> 

The performance impact only comes into play when swap is on a network
device and pfmemalloc reserves are in use. The rest of the time, the check
on ac avoids all the cost, and there is a micro-optimisation later (patch
12) to avoid even the function call.
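
For illustration, the fast path has roughly the following shape, assuming
(as this series does) that a pfmemalloc flag is added to struct
array_cache; the helper name __ac_get_obj is illustrative here rather
than the patch as posted:

static void *ac_get_obj(struct kmem_cache *cachep,
			struct array_cache *ac, gfp_t flags)
{
	/* Common case: no reserve objects in the array, plain pop. */
	if (likely(!ac->pfmemalloc))
		return ac->entry[--ac->avail];

	/*
	 * Rare case: swap is on a network device and the reserves are
	 * in use, so pfmemalloc objects may have to be skipped for
	 * callers that are not entitled to them.
	 */
	return __ac_get_obj(cachep, ac, flags);
}
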

> > In slub, it's sufficient to check kmem_cache_cpu to know whether the
> > objects in the list are pfmemalloc or not.
> 
> We try to minimize the size of kmem_cache_cpu. The page pointer is readily
> available. We just removed the node field from kmem_cache_cpu because it
> was less expensive to get the node number from the struct page field.
> 
> The same is certainly true for a PFMEMALLOC flag.
> 

Ok, are you asking that I use the page flag and leave kmem_cache_cpu
alone in the slub case? I can certainly check that out if it's what you
are asking for.

> > Yeah, you're right on the button there. I did my checking assuming that
> > PG_active+PG_slab were safe to use. The following is an untested patch that
> > I probably got details wrong in but it illustrates where virt_to_page()
> > starts cropping up.
> 
> Yes, you need to come up with a way to not use virt_to_page, otherwise
> slab performance is significantly impacted.

I did come up with a way: the necessary information is in ac and slabp
in slab :/. There are not exactly many ways the information can be
recorded.
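
Concretely, something like the following is where that information can
live; both pfmemalloc fields here are assumptions mirroring the
description above, with the existing fields elided:

struct array_cache {
	unsigned int avail;
	/* ... existing fields ... */
	bool pfmemalloc;	/* reserve objects present in entry[] */
	void *entry[];
};

struct slab {
	struct list_head list;
	/* ... existing fields ... */
	bool pfmemalloc;	/* backing page came from the reserves */
};
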

> On NUMA we are already doing a page struct lookup on free in slab.
> If you would save the page struct pointer
> there and reuse it then you would not have an issue at least on free.
> 

That information is only available on NUMA and only when there is more
than one node. Having cache_free_alien return the page for passing to
ac_put_obj() would also be ugly. The biggest downfall by far is that
single-node machines would incur the cost of virt_to_page() where they
did not have to before. This is not a solution and it is not better than
the current simple check on a struct field.

> You still would need to determine which "struct slab" pointer is in use
> which will also require similar lookups in various places.
> 
> Transfer of the pfmemalloc flags (guess you must have a pfmemalloc
> field in struct slab then) in slab is best done when allocating and
> freeing a slab page from the page allocator.
> 

page->pfmemalloc is already transferred to the slab in cache_grow().
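
Roughly, the relevant hunk is this (a sketch only; slabp->pfmemalloc is
the assumed field from above). Note that one virt_to_page() here is
fine, since cache_grow() runs once per slab page rather than once per
object allocation or free:

	/* cache_grow(): propagate the backing page's reserve state */
	objp = kmem_getpages(cachep, local_flags, nodeid);
	slabp = alloc_slabmgmt(cachep, objp, offset, local_flags, nodeid);
	slabp->pfmemalloc = virt_to_page(objp)->pfmemalloc;
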

> I think its rather trivial to add the support you want in a non intrusive
> way to slub. Slab would require some more thought and discussion.
> 

I'm slightly confused by this sentence. Support for slub is already in
the patch and, as you say, it's fairly straightforward. Supporting a page
flag and leaving kmem_cache_cpu alone may also be easier, as
kmem_cache_cpu->page can be used instead of a kmem_cache_cpu->pfmemalloc
field.
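
As a sketch of that alternative, assuming a page-flag helper along the
lines of PageSlabPfmemalloc() (the flag, the helper and
gfp_pfmemalloc_allowed() are all assumptions here, not taken from the
posted patch):

static inline bool pfmemalloc_match(struct kmem_cache_cpu *c,
				    gfp_t gfpflags)
{
	/*
	 * Test the per-cpu slab's page directly rather than caching a
	 * separate pfmemalloc bool in kmem_cache_cpu itself.
	 */
	if (unlikely(PageSlabPfmemalloc(c->page)))
		return gfp_pfmemalloc_allowed(gfpflags);
	return true;
}
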

-- 
Mel Gorman
SUSE Labs

Thread overview: 37+ messages
2012-02-06 22:56 [PATCH 00/15] Swap-over-NBD without deadlocking V8 Mel Gorman
2012-02-06 22:56 ` [PATCH 01/15] mm: Serialize access to min_free_kbytes Mel Gorman
2012-02-08 18:47   ` Rik van Riel
2012-02-06 22:56 ` [PATCH 02/15] mm: sl[au]b: Add knowledge of PFMEMALLOC reserve pages Mel Gorman
2012-02-07 16:27   ` Christoph Lameter
2012-02-08 14:45     ` Mel Gorman
2012-02-08 15:14       ` Christoph Lameter
2012-02-08 16:34         ` Mel Gorman
2012-02-08 19:49           ` Christoph Lameter
2012-02-08 21:23             ` Mel Gorman [this message]
2012-02-08 22:13               ` Christoph Lameter
2012-02-09 12:50                 ` Mel Gorman
2012-02-09 19:53                   ` Christoph Lameter
2012-02-10 10:26                     ` Mel Gorman
2012-02-10 21:01                       ` Christoph Lameter
2012-02-10 22:07                         ` Christoph Lameter
2012-02-13 10:12                           ` Mel Gorman
2012-02-13 11:10                         ` Mel Gorman
2012-02-06 22:56 ` [PATCH 03/15] mm: Introduce __GFP_MEMALLOC to allow access to emergency reserves Mel Gorman
2012-02-06 22:56 ` [PATCH 04/15] mm: allow PF_MEMALLOC from softirq context Mel Gorman
2012-02-06 22:56 ` [PATCH 05/15] mm: Ignore mempolicies when using ALLOC_NO_WATERMARK Mel Gorman
2012-02-06 22:56 ` [PATCH 06/15] net: Introduce sk_allocation() to allow addition of GFP flags depending on the individual socket Mel Gorman
2012-02-06 22:56 ` [PATCH 07/15] netvm: Allow the use of __GFP_MEMALLOC by specific sockets Mel Gorman
2012-02-06 22:56 ` [PATCH 08/15] netvm: Allow skb allocation to use PFMEMALLOC reserves Mel Gorman
2012-02-06 22:56 ` [PATCH 09/15] netvm: Propagate page->pfmemalloc to skb Mel Gorman
2012-02-06 22:56 ` [PATCH 10/15] netvm: Propagate page->pfmemalloc from netdev_alloc_page " Mel Gorman
2012-02-07 23:38   ` Alexander Duyck
2012-02-08 15:23     ` Mel Gorman
2012-02-06 22:56 ` [PATCH 11/15] netvm: Set PF_MEMALLOC as appropriate during SKB processing Mel Gorman
2012-02-06 22:56 ` [PATCH 12/15] mm: Micro-optimise slab to avoid a function call Mel Gorman
2012-02-06 22:56 ` [PATCH 13/15] nbd: Set SOCK_MEMALLOC for access to PFMEMALLOC reserves Mel Gorman
2012-02-06 22:56 ` [PATCH 14/15] mm: Throttle direct reclaimers if PF_MEMALLOC reserves are low and swap is backed by network storage Mel Gorman
2012-02-06 22:56 ` [PATCH 15/15] mm: Account for the number of times direct reclaimers get throttled Mel Gorman
2012-02-07 12:45 ` [PATCH 00/15] Swap-over-NBD without deadlocking V8 Hillf Danton
2012-02-07 13:27   ` Mel Gorman
2012-02-08 12:51     ` Hillf Danton
2012-02-08 15:26       ` Mel Gorman
