All of lore.kernel.org
 help / color / mirror / Atom feed
From: Hugh Dickins <hugh@veritas.com>
To: Andrew Morton <akpm@osdl.org>
Cc: Andrea Arcangeli <andrea@suse.de>,
	nickpiggin@yahoo.com.au, linux-kernel@vger.kernel.org
Subject: Re: smp race fix between invalidate_inode_pages* and do_no_page
Date: Fri, 7 Apr 2006 20:18:18 +0100 (BST)	[thread overview]
Message-ID: <Pine.LNX.4.64.0604071945200.16361@blonde.wat.veritas.com> (raw)
In-Reply-To: <20060401212138.3b48f634.akpm@osdl.org>

On Sat, 1 Apr 2006, Andrew Morton wrote:
> Andrea Arcangeli <andrea@suse.de> wrote:
> > 
> > The new PG_truncate bitflag can be used as well to eliminate the
> > truncate_count from all vmas etc... that would save substantial memory
> > and remove some complexity, truncate_count grown a lot since the time we
> > introduced it.

truncate_count is playing a useful role with the vma tree, allowing
us to find our place again after allowing preemption in.  Though I'm
sometimes wondering whether just to make the vma _bigger_ and add a
list instead of that truncate_count: construct a list of vmas from
the prio_tree for unmap_mapping_range, so it can keep place in that,
instead of having to restart the tree whenever preempted.  But I
think NFS doesn't hold i_sem across unmapping, so it wouldn't work.

> > The PG_truncate is needed as well because we can't know in do_no_page if
> > page->mapping is legitimate null or not (think bttv and other device
> > drivers returning page->mapping null because they're private but not
> > reserved pages etc..)

That part is easily dealt with, as Nick and I have suggested,
just by marking vmas liable to having their pages truncated.
But that's certainly no more than one part of a larger solution.

> The patch will pretty clearly fix that.  But it really would be better to
> place all the cost over on the invalidate side, rather than adding overhead
> to the pagefault path.

I see ~2% slowdown in relevant faulting tests e.g. the lmbench ones.
I'd certainly prefer not to be locking pages here: or not without
word from the high-scalability people that Andrea's patch really in
practice doesn't hurt them (but the ~2% I see is on UP as much as MP).

> Could we perhaps check the page_count(page) and/or page_mapped(page) in
> invalidate_complete_page(), when we have tree_lock?

I'm thinking on it, but not finding it easy.  And I've just realized
that invalidate_inode_pages2 isn't the only problematic case here
(though agreed the more urgent one): MADV_REMOVE likewise needs to
invalidate pages without adjusting i_size, so also cannot rely on
the truncate_count/i_size method.

> Do you have handy any test code which others can use to recreate this bug?

I'd be glad of that too.

Hugh

  reply	other threads:[~2006-04-07 19:18 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2005-12-13 19:37 smp race fix between invalidate_inode_pages* and do_no_page Andrea Arcangeli
2005-12-13 21:02 ` Andrew Morton
2005-12-13 21:14   ` Andrea Arcangeli
2005-12-16 13:51     ` Andrea Arcangeli
2006-01-10  6:24       ` Andrea Arcangeli
2006-01-10  6:48         ` Andrea Arcangeli
2006-01-11  4:08         ` Nick Piggin
2006-01-11  8:23           ` Andrea Arcangeli
2006-01-11  8:51             ` Andrew Morton
2006-01-11  9:02               ` Andrea Arcangeli
2006-01-11  9:06                 ` Andrew Morton
2006-01-11  9:13                   ` Andrea Arcangeli
2006-01-11 20:49                     ` Hugh Dickins
2006-01-11 21:05                       ` Andrew Morton
2006-01-13  7:35                       ` Nick Piggin
2006-01-13  7:47                         ` Andrew Morton
2006-01-13 10:37                           ` Nick Piggin
2006-03-31 12:36                             ` Andrea Arcangeli
2006-04-02  5:17                               ` Nick Piggin
2006-04-02  5:21                               ` Andrew Morton
2006-04-07 19:18                                 ` Hugh Dickins [this message]
2006-01-11  9:39                 ` Nick Piggin
2006-01-11  9:34             ` Nick Piggin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Pine.LNX.4.64.0604071945200.16361@blonde.wat.veritas.com \
    --to=hugh@veritas.com \
    --cc=akpm@osdl.org \
    --cc=andrea@suse.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=nickpiggin@yahoo.com.au \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.