From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Christoph Lameter <cl@linux-foundation.org>,
	Arjan van de Ven <arjan@infradead.org>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Peter Zijlstra <peterz@infradead.org>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"minchan.kim@gmail.com" <minchan.kim@gmail.com>,
	"hugh.dickins" <hugh.dickins@tiscali.co.uk>,
	Nick Piggin <nickpiggin@yahoo.com.au>,
	Ingo Molnar <mingo@elte.hu>
Subject: Re: [RFC][PATCH 6/8] mm: handle_speculative_fault()
Date: Thu, 7 Jan 2010 08:34:18 -0800
Message-ID: <20100107163418.GA6764@linux.vnet.ibm.com>
In-Reply-To: <alpine.LFD.2.00.1001070814080.7821@localhost.localdomain>

On Thu, Jan 07, 2010 at 08:19:56AM -0800, Linus Torvalds wrote:
> 
> 
> On Thu, 7 Jan 2010, Christoph Lameter wrote:
> > >
> > > depends on the workload; on a many-threads-java workload, you also get
> > > it for write quite a bit (lots of malloc/frees in userspace in addition
> > > to pagefaults).. at which point you do end up serializing on the
> > > zeroing.
> > >
> > > There's some real life real big workloads that show this pretty badly;
> > > so far the workaround is to have glibc batch up a lot of the free()s..
> > > but that's just pushing it a little further out.
> > 
> > Again mmap_sem is a rwsem and only a read lock is held. Zeroing in
> > do_anonymous_page can occur concurrently on multiple processors in the
> > same address space. The pte lock is intentionally taken *after* zeroing to
> > allow concurrent zeroing to occur.
> 
> You're missing what Arjan said - the java workload does a lot of memory 
> allocations too, causing mmap/munmap.
> 
> So now some paths are indeed holding it for writing (or need to wait for 
> it to become writable). And the fairness of rwsems quite possibly then 
> impacts throughput a _lot_..
> 
> (Side note: I wonder if we should wake up _all_ readers when we wake up 
> any. Right now, we wake up all readers - but only until we hit a writer. 
> Which is the _fair_ thing to do, but it does mean that we can end up in 
> horrible patterns of alternating readers/writers, when it could be much 
> better to just say "release the hounds" and let all pending readers go 
> after a writer has had its turn).

This can indeed work well in many cases.  The situation where it can
get you in trouble is where there are many more readers than CPUs (or
disk spindles or whatever it is that limits the amount of effective
parallelism readers can attain).  In this case, releasing more readers
than can run in parallel will delay the writers for no good reason.

So one strategy is to release readers, but no more than the number of
CPUs (or whatever the limit is).  More complicated strategies are out
there, but there is a limit to how much of the scheduler one should
involve in lock-granting decisions.
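
To make the two wakeup policies concrete, here is a minimal userspace
sketch in C.  It is not the kernel's rwsem code: struct waiter,
wake_readers_fair() and wake_readers_capped() are hypothetical names
invented for illustration, and "waking" is modeled by setting a flag.
The first function is the fair policy described in the quoted side
note (wake readers until a writer is hit); the second lets readers pass
queued writers but wakes no more than max_readers of them, where
max_readers is assumed to be the number of CPUs or whatever else
limits reader parallelism.

```c
#include <stddef.h>

/* Hypothetical waiter record; not the kernel's rwsem waiter. */
enum waiter_kind { WAITER_READER, WAITER_WRITER };

struct waiter {
	enum waiter_kind kind;
	int woken;		/* set to 1 once selected for wakeup */
};

/*
 * Fair policy: wake readers from the head of the queue, stopping at
 * the first writer.  Returns the number of readers woken.
 */
size_t wake_readers_fair(struct waiter *queue, size_t n)
{
	size_t woken = 0;

	for (size_t i = 0; i < n && queue[i].kind == WAITER_READER; i++) {
		queue[i].woken = 1;
		woken++;
	}
	return woken;
}

/*
 * Capped policy: wake readers anywhere in the queue, skipping over
 * writers, but never more than max_readers of them.  Readers beyond
 * the cap stay queued, so the next writer is not delayed by readers
 * that could not have run in parallel anyway.
 */
size_t wake_readers_capped(struct waiter *queue, size_t n,
			   size_t max_readers)
{
	size_t woken = 0;

	for (size_t i = 0; i < n && woken < max_readers; i++) {
		if (queue[i].kind != WAITER_READER || queue[i].woken)
			continue;
		queue[i].woken = 1;
		woken++;
	}
	return woken;
}
```

With a queue of R R W R R R and max_readers of 4, the fair policy wakes
only the first two readers, while the capped policy wakes four readers
and leaves the last reader and the writer queued.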

							Thanx, Paul

