All of lore.kernel.org
 help / color / mirror / Atom feed
From: Neil Brown <neilb@suse.de>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>,
	Hugh Dickins <hughd@google.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Nick Bowler <nbowler@elliptictech.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	linux-nfs@vger.kernel.org, Rik van Riel <riel@redhat.com>,
	Christoph Hellwig <hch@lst.de>, Al Viro <viro@zeniv.linux.org.uk>,
	Nick Piggin <npiggin@kernel.dk>
Subject: Re: [PATCH v2 3/3] NFS: Fix a memory leak in nfs_readdir
Date: Thu, 2 Dec 2010 08:29:13 +1100	[thread overview]
Message-ID: <20101202082913.7cb98444@notabene.brown> (raw)
In-Reply-To: <20101201123929.ab7cef1d.akpm@linux-foundation.org>

On Wed, 1 Dec 2010 12:39:29 -0800 Andrew Morton <akpm@linux-foundation.org>
wrote:

> On Wed, 01 Dec 2010 15:05:38 -0500
> Trond Myklebust <Trond.Myklebust@netapp.com> wrote:
> 
> > On Wed, 2010-12-01 at 11:23 -0800, Hugh Dickins wrote:
> > > On Wed, 1 Dec 2010, Trond Myklebust wrote:
> > > > On Wed, 2010-12-01 at 08:17 -0800, Linus Torvalds wrote:
> > > > > include/linux/fs.h |    1 +
> > > > >  mm/vmscan.c        |    3 +++
> > > > >  2 files changed, 4 insertions(+), 0 deletions(-)
> > > > > 
> > > > > diff --git a/include/linux/fs.h b/include/linux/fs.h
> > > > > index c9e06cc..090f0ea 100644
> > > > > --- a/include/linux/fs.h
> > > > > +++ b/include/linux/fs.h
> > > > > @@ -602,6 +602,7 @@ struct address_space_operations {
> > > > >         sector_t (*bmap)(struct address_space *, sector_t);
> > > > >         void (*invalidatepage) (struct page *, unsigned long);
> > > > >         int (*releasepage) (struct page *, gfp_t);
> > > > > +       void (*freepage)(struct page *);
> > > > >         ssize_t (*direct_IO)(int, struct kiocb *, const struct iovec
> > > > > *iov,
> > > > >                         loff_t offset, unsigned long nr_segs);
> > > > >         int (*get_xip_mem)(struct address_space *, pgoff_t, int,
> > > > > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > > > > index d31d7ce..1accb01 100644
> > > > > --- a/mm/vmscan.c
> > > > > +++ b/mm/vmscan.c
> > > > > @@ -499,6 +499,9 @@ static int __remove_mapping(struct address_space
> > > > > *mapping, struct page *page)
> > > > >                 mem_cgroup_uncharge_cache_page(page);
> > > > >         }
> > > > >  
> > > > > +       if (mapping->a_ops->freepage)
> > > > > +               mapping->a_ops->freepage(page);
> > > > 
> > > > Hmm... Looking again at the problem, it appears that the same callback
> > > > needs to be added to truncate_complete_page() and
> > > > invalidate_complete_page2(). Otherwise we end up in a situation where
> > > > the page can sometimes be removed from the page cache without calling
> > > > freepage().
> > > > 
> > > > > +
> > > > >         return 1;
> > > > >  
> > > > >  cannot_free:
> > > 
> > > Yes, I was wondering quite how we would define this ->freepage thing,
> > > if it gets called from one place that removes from page cache and not
> > > from others.
> > > 
> > > Another minor problem with it: it would probably need to take the
> > > struct address_space *mapping as arg as well as struct page *page:
> > > because by this time page->mapping has been reset to NULL.
> > > 
> > > But I'm not at all keen on adding a calllback in this very special
> > > frozen-to-0-references place: please let's not do it without an okay
> > > from Nick Piggin (now Cc'ed).
> > > 
> > > I agree completely with what Linus said originally about how the
> > > page cannot be freed while there's a reference to it, and it should
> > > be possible to work this without your additional page locks.
> > > 
> > > Your ->releasepage should be able to judge whether the page is likely
> > > (not certain) to be freed - page_count 3? 1 for the page cache, 1 for
> > > the page_private reference, 1 for vmscan's reference, I think.  Then
> > > it can mark !PageUptodate and proceed with freeing the stuff you had
> > > allocated, undo page_has_private and its reference, and return 1 (or
> > > return 0 if it decides to hold on to the page).
> > 
> > That is very brittle. I'd prefer not to have to scan linux-mm every week
> > in order to find out if someone changed the page_count.
> > 
> > However, while reading Documentation/filesystems/vfs.txt (in order to
> > add documentation for freepage) I was surprised to read that the
> > ->releasepage() is itself supposed to be allowed to actually remove the
> > page from the address space if it so desires.
> 
> That doesn't sound right.  It came from Neil in 2006.
> 
> Neil, what were you thinking there?   Did you find such a ->releasepage()?

Nope, no idea, sorry.

No releasepage functions do anything like that, and no call sites suggest it
could be a possibility.  Quite the reverse - they are likely to remove the
page from the mapping without checking that it is still in the mapping.

So that sentence should be deleted.

(Getting good review for doco updates is sooo hard :-)

NeilBrown


  reply	other threads:[~2010-12-01 21:29 UTC|newest]

Thread overview: 72+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-11-30 17:42 [PATCH] NFS: Fix a readdirplus bug Trond Myklebust
2010-11-30 22:10 ` Linus Torvalds
2010-11-30 22:13   ` Trond Myklebust
2010-12-01  3:47   ` [PATCH 0/3] Fix more NFS readdir regressions Trond Myklebust
2010-12-01  3:47   ` [PATCH 1/3] NFS: Ensure we use the correct cookie in nfs_readdir_xdr_filler Trond Myklebust
2010-12-01 15:04     ` Nick Bowler
2010-12-01 15:36       ` [PATCH v2 0/3] Fix more NFS readdir regressions Trond Myklebust
2010-12-01 15:36       ` [PATCH v2 1/3] NFS: Ensure we use the correct cookie in nfs_readdir_xdr_filler Trond Myklebust
2010-12-01 15:36       ` [PATCH v2 2/3] NFS: lock the readdir page while it is in use Trond Myklebust
2010-12-01 15:36       ` [PATCH v2 3/3] NFS: Fix a memory leak in nfs_readdir Trond Myklebust
2010-12-01 16:17         ` Linus Torvalds
2010-12-01 16:35           ` Rik van Riel
2010-12-01 16:45             ` Benny Halevy
2010-12-01 16:47             ` Linus Torvalds
2010-12-01 17:02               ` Rik van Riel
2010-12-01 17:58           ` Trond Myklebust
2010-12-01 18:29           ` Miklos Szeredi
2010-12-01 18:54           ` Trond Myklebust
2010-12-01 19:23             ` Hugh Dickins
2010-12-01 19:52               ` Linus Torvalds
2010-12-01 20:05               ` Trond Myklebust
2010-12-01 20:39                 ` Andrew Morton
2010-12-01 21:29                   ` Neil Brown [this message]
2010-12-01 22:43                     ` Andrew Morton
2010-12-01 23:01                       ` Neil Brown
2010-12-01 19:47             ` Linus Torvalds
2010-12-01 20:10               ` Trond Myklebust
2010-12-01 20:10                 ` Trond Myklebust
2010-12-01 20:18                 ` Linus Torvalds
2010-12-01 20:31                 ` Hugh Dickins
2010-12-01 20:33                 ` Andrew Morton
2010-12-01 21:02                   ` Hugh Dickins
2010-12-01 21:15                     ` Hugh Dickins
2010-12-01 21:38                       ` Andrew Morton
2010-12-01 21:51                         ` Trond Myklebust
2010-12-01 22:13                           ` Andrew Morton
2010-12-01 22:24                             ` Linus Torvalds
2010-12-01 22:38                               ` Andrew Morton
2010-12-01 22:47                                 ` Trond Myklebust
2010-12-01 23:21                                   ` Trond Myklebust
2010-12-01 23:21                                     ` Trond Myklebust
2010-12-01 23:46                                     ` Andrew Morton
2010-12-01 23:56                                       ` Trond Myklebust
2010-12-01 23:31                                 ` Linus Torvalds
2010-12-01 23:36                                   ` Andrew Morton
2010-12-02  1:05                                     ` Linus Torvalds
2010-12-02  1:22                                       ` Andrew Morton
2010-12-02  1:42                                         ` Linus Torvalds
2010-12-02  2:05                                           ` Andrew Morton
2010-12-02  3:08                                           ` [PATCH v3 0/3] Fix more NFS readdir regressions Trond Myklebust
2010-12-02  3:08                                           ` [PATCH v3 1/3] NFS: Ensure we use the correct cookie in nfs_readdir_xdr_filler Trond Myklebust
2010-12-02  3:08                                           ` [PATCH v3 2/3] Call the filesystem back whenever a page is removed from the page cache Trond Myklebust
2010-12-02  3:34                                             ` Hugh Dickins
2010-12-02  3:53                                               ` Trond Myklebust
2010-12-02  3:58                                                 ` Linus Torvalds
2010-12-06 16:59                                                   ` [PATCH v4 0/3] Fix more NFS readdir regressions Trond Myklebust
2010-12-06 16:59                                                   ` [PATCH v4 1/3] NFS: Ensure we use the correct cookie in nfs_readdir_xdr_filler Trond Myklebust
2010-12-06 16:59                                                   ` [PATCH v4 2/3] Call the filesystem back whenever a page is removed from the page cache Trond Myklebust
2010-12-07  7:08                                                     ` Nick Piggin
2010-12-06 16:59                                                   ` [PATCH v4 3/3] NFS: Fix a memory leak in nfs_readdir Trond Myklebust
2010-12-02  3:08                                           ` [PATCH v3 " Trond Myklebust
2010-12-03  9:12                                           ` [PATCH v2 " Nick Piggin
2010-12-01 23:43                                   ` Trond Myklebust
2010-12-01 22:43                               ` Hugh Dickins
2010-12-01  3:47   ` [PATCH 2/3] NFS: lock the readdir page while it is in use Trond Myklebust
2010-12-01  4:10     ` Linus Torvalds
2010-12-01  4:29       ` Trond Myklebust
2010-12-01  5:06         ` Linus Torvalds
2010-12-01 14:49           ` Trond Myklebust
2010-12-01 13:14         ` Rik van Riel
2010-12-01 14:55           ` Trond Myklebust
2010-12-01  3:47   ` [PATCH 3/3] NFS: Fix a memory leak in nfs_readdir Trond Myklebust

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20101202082913.7cb98444@notabene.brown \
    --to=neilb@suse.de \
    --cc=Trond.Myklebust@netapp.com \
    --cc=akpm@linux-foundation.org \
    --cc=hch@lst.de \
    --cc=hughd@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=nbowler@elliptictech.com \
    --cc=npiggin@kernel.dk \
    --cc=riel@redhat.com \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.