All of lore.kernel.org
 help / color / mirror / Atom feed
From: Trond Myklebust <Trond.Myklebust@netapp.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Hugh Dickins <hughd@google.com>, Nick Piggin <npiggin@kernel.dk>,
	Nick Bowler <nbowler@elliptictech.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	linux-nfs@vger.kernel.org, Rik van Riel <riel@redhat.com>,
	Christoph Hellwig <hch@lst.de>, Al Viro <viro@zeniv.linux.org.uk>
Subject: Re: [PATCH v2 3/3] NFS: Fix a memory leak in nfs_readdir
Date: Wed, 01 Dec 2010 18:43:56 -0500	[thread overview]
Message-ID: <1291247036.6609.76.camel@heimdal.trondhjem.org> (raw)
In-Reply-To: <AANLkTi=x1bz5sJXOGERXh8iRHNZa+_gaYOyDecg3G92+@mail.gmail.com>

On Wed, 2010-12-01 at 15:31 -0800, Linus Torvalds wrote:
> On Wed, Dec 1, 2010 at 2:38 PM, Andrew Morton <akpm@linux-foundation.org> wrote:
> >
> > OK, the stop_machine() plugs a lot of potential race-vs-module-unload
> > things.  But Trond is referring to races against vmscan inode reclaim,
> > unmount, etc.
> 
> So?
> 
> A filesystem module cannot be unloaded while it's still mounted.
> 
> And unmount doesn't succeed until all inodes are gone.
> 
> And getting rid of an inode doesn't succeed until all pages associated
> with it are gone.
> 
> And getting rid of the pages involves locking them (whether in
> truncate or vmscan) and removing them from all lists.
> 
> Ergo: vmscan has a locked page leads to the filesystem being
> guaranteed to not be unmounted.  And that, in turn, guarantees that
> the module won't be unloaded until the machine has gone through an
> idle cycle.
> 
> It really is that simple. There's nothing subtle there. The reason
> spin_unlock(&mapping->tree_lock) is safe is exactly the above trivial
> chain of dependencies. And it's also exactly why
> mapping->a_ops->freepage() would also be safe.
> 
> This is pretty much how all the module races are handled. Doing module
> ref-counts per page (or per packet in flight for things like
> networking) would be prohibitively expensive. There's no way we can
> ever do that.

Although the page is locked, it may no longer be visible to the lockless
page lookup once the radix_tree_delete() completes in
__remove_from_page_cache.
Furthermore, if the same routine causes mapping->nr_pages to go to zero
before iput_final() hits truncate_inode_pages(), then the latter exits
immediately.

Both these cases would appear to allow iput_final() to release the inode
before vmscan gets round to unlocking the mapping->tree_lock since
truncate_inode_pages() no longer thinks it has any work to do.

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com


  parent reply	other threads:[~2010-12-01 23:44 UTC|newest]

Thread overview: 72+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-11-30 17:42 [PATCH] NFS: Fix a readdirplus bug Trond Myklebust
2010-11-30 22:10 ` Linus Torvalds
2010-11-30 22:13   ` Trond Myklebust
2010-12-01  3:47   ` [PATCH 0/3] Fix more NFS readdir regressions Trond Myklebust
2010-12-01  3:47   ` [PATCH 1/3] NFS: Ensure we use the correct cookie in nfs_readdir_xdr_filler Trond Myklebust
2010-12-01 15:04     ` Nick Bowler
2010-12-01 15:36       ` [PATCH v2 0/3] Fix more NFS readdir regressions Trond Myklebust
2010-12-01 15:36       ` [PATCH v2 1/3] NFS: Ensure we use the correct cookie in nfs_readdir_xdr_filler Trond Myklebust
2010-12-01 15:36       ` [PATCH v2 2/3] NFS: lock the readdir page while it is in use Trond Myklebust
2010-12-01 15:36       ` [PATCH v2 3/3] NFS: Fix a memory leak in nfs_readdir Trond Myklebust
2010-12-01 16:17         ` Linus Torvalds
2010-12-01 16:35           ` Rik van Riel
2010-12-01 16:45             ` Benny Halevy
2010-12-01 16:47             ` Linus Torvalds
2010-12-01 17:02               ` Rik van Riel
2010-12-01 17:58           ` Trond Myklebust
2010-12-01 18:29           ` Miklos Szeredi
2010-12-01 18:54           ` Trond Myklebust
2010-12-01 19:23             ` Hugh Dickins
2010-12-01 19:52               ` Linus Torvalds
2010-12-01 20:05               ` Trond Myklebust
2010-12-01 20:39                 ` Andrew Morton
2010-12-01 21:29                   ` Neil Brown
2010-12-01 22:43                     ` Andrew Morton
2010-12-01 23:01                       ` Neil Brown
2010-12-01 19:47             ` Linus Torvalds
2010-12-01 20:10               ` Trond Myklebust
2010-12-01 20:10                 ` Trond Myklebust
2010-12-01 20:18                 ` Linus Torvalds
2010-12-01 20:31                 ` Hugh Dickins
2010-12-01 20:33                 ` Andrew Morton
2010-12-01 21:02                   ` Hugh Dickins
2010-12-01 21:15                     ` Hugh Dickins
2010-12-01 21:38                       ` Andrew Morton
2010-12-01 21:51                         ` Trond Myklebust
2010-12-01 22:13                           ` Andrew Morton
2010-12-01 22:24                             ` Linus Torvalds
2010-12-01 22:38                               ` Andrew Morton
2010-12-01 22:47                                 ` Trond Myklebust
2010-12-01 23:21                                   ` Trond Myklebust
2010-12-01 23:21                                     ` Trond Myklebust
2010-12-01 23:46                                     ` Andrew Morton
2010-12-01 23:56                                       ` Trond Myklebust
2010-12-01 23:31                                 ` Linus Torvalds
2010-12-01 23:36                                   ` Andrew Morton
2010-12-02  1:05                                     ` Linus Torvalds
2010-12-02  1:22                                       ` Andrew Morton
2010-12-02  1:42                                         ` Linus Torvalds
2010-12-02  2:05                                           ` Andrew Morton
2010-12-02  3:08                                           ` [PATCH v3 0/3] Fix more NFS readdir regressions Trond Myklebust
2010-12-02  3:08                                           ` [PATCH v3 1/3] NFS: Ensure we use the correct cookie in nfs_readdir_xdr_filler Trond Myklebust
2010-12-02  3:08                                           ` [PATCH v3 2/3] Call the filesystem back whenever a page is removed from the page cache Trond Myklebust
2010-12-02  3:34                                             ` Hugh Dickins
2010-12-02  3:53                                               ` Trond Myklebust
2010-12-02  3:58                                                 ` Linus Torvalds
2010-12-06 16:59                                                   ` [PATCH v4 0/3] Fix more NFS readdir regressions Trond Myklebust
2010-12-06 16:59                                                   ` [PATCH v4 1/3] NFS: Ensure we use the correct cookie in nfs_readdir_xdr_filler Trond Myklebust
2010-12-06 16:59                                                   ` [PATCH v4 2/3] Call the filesystem back whenever a page is removed from the page cache Trond Myklebust
2010-12-07  7:08                                                     ` Nick Piggin
2010-12-06 16:59                                                   ` [PATCH v4 3/3] NFS: Fix a memory leak in nfs_readdir Trond Myklebust
2010-12-02  3:08                                           ` [PATCH v3 " Trond Myklebust
2010-12-03  9:12                                           ` [PATCH v2 " Nick Piggin
2010-12-01 23:43                                   ` Trond Myklebust [this message]
2010-12-01 22:43                               ` Hugh Dickins
2010-12-01  3:47   ` [PATCH 2/3] NFS: lock the readdir page while it is in use Trond Myklebust
2010-12-01  4:10     ` Linus Torvalds
2010-12-01  4:29       ` Trond Myklebust
2010-12-01  5:06         ` Linus Torvalds
2010-12-01 14:49           ` Trond Myklebust
2010-12-01 13:14         ` Rik van Riel
2010-12-01 14:55           ` Trond Myklebust
2010-12-01  3:47   ` [PATCH 3/3] NFS: Fix a memory leak in nfs_readdir Trond Myklebust

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1291247036.6609.76.camel@heimdal.trondhjem.org \
    --to=trond.myklebust@netapp.com \
    --cc=akpm@linux-foundation.org \
    --cc=hch@lst.de \
    --cc=hughd@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=nbowler@elliptictech.com \
    --cc=npiggin@kernel.dk \
    --cc=riel@redhat.com \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.