linux-kernel.vger.kernel.org archive mirror
* [PATCH] Fix race between callers of read_cache_page_async and invalidate_inode_pages.
@ 2009-04-27  5:20 Neil Brown
  2009-04-27  5:37 ` Andrew Morton
  0 siblings, 1 reply; 3+ messages in thread
From: Neil Brown @ 2009-04-27  5:20 UTC (permalink / raw)
  To: linux-kernel, linux-mm; +Cc: Andrew Morton, Nick Piggin, David Woodhouse




Callers of read_cache_page_async typically wait for the page to become
unlocked (wait_on_page_locked) and then test PageUptodate to see if
the read was successful, or if there was an error.

This is wrong.

invalidate_inode_pages can cause an unlocked page to lose its
PageUptodate flag at any time without implying a read error.

As any read error will cause PageError to be set, it is much safer,
and more idiomatic to test "PageError" than to test "!PageUptodate".
Hence this patch.

An actual failure that has been seen (on a 2.6.5 based kernel)
involves symlinks.  Symlinks are more susceptible as the 'open' and
'read' phases can be very close together, and so can both overlap with
invalidate_inode_pages.

The sequence goes something like:

  high memory pressure prunes dentry
  continuing memory pressure causes prune
    of inode to start.  Start invalidating
    page(s).
                                             Lookup of path containing symlink
                                              causes inode lookup (inode is
                                              found in cache).
                                             page_getlink calls
                                               read_cache_page
                                                read_cache_page_async
                                                finds that page is Uptodate
   __invalidate_mapping_pages finds page
   and locks it
                                                read_cache_page waits for lock
                                                to be released.
   invalidate_complete_page clears
   PageUptodate
                                                read_cache_page finds Uptodate
                                                is clear and assumes an error.

As we can see, finding !PageUptodate is not an error.  Possibly in
this case we could try a read again, but really there is no point.
After calling read_cache_page_async and waiting for the page to be
unlocked, either the page has been read, or there was an error.
The simplest way to check is to test PageError.
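
The interleaving above can be replayed in a small userspace model.  This
is only a sketch: struct model_page and the helper names are
hypothetical stand-ins, not kernel types or APIs, and the "race" is
replayed deterministically on one thread rather than reproduced.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the page flags discussed above. */
struct model_page {
	bool locked;
	bool uptodate;
	bool error;
};

/* Old check: a cleared Uptodate flag is treated as a read error. */
static bool old_check_fails(const struct model_page *p)
{
	return !p->uptodate;
}

/* Patched check: only a recorded read error counts as a failure. */
static bool new_check_fails(const struct model_page *p)
{
	return p->error;
}

/* Replay the sequence from the description on a single page. */
static struct model_page replay_race(void)
{
	/* Reader: read_cache_page_async finds the page Uptodate. */
	struct model_page p = { .locked = false, .uptodate = true, .error = false };

	/* Invalidator: finds the page and locks it. */
	p.locked = true;

	/* Reader would now be sleeping in wait_on_page_locked(). */

	/* Invalidator: clears Uptodate, then unlocks.  No I/O error occurred. */
	p.uptodate = false;
	p.locked = false;

	/* Reader wakes here and applies its check to the returned state. */
	assert(!p.locked);
	return p;
}
```

On the state returned by replay_race(), old_check_fails() is true even
though no read ever failed, while new_check_fails() is false.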

Note the "typically" in the first sentence refers to fs/jffs2/fs.c
which uses read_cache_page_async, but never checks for an error, or
even waits for the page to be unlocked.  This seems wrong, though
maybe there is some justification for it.

Signed-off-by: NeilBrown <neilb@suse.de>
Cc: Nick Piggin <npiggin@suse.de>
Cc: David Woodhouse <dwmw2@infradead.org>
---
 fs/cramfs/inode.c |    2 +-
 mm/filemap.c      |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/cramfs/inode.c b/fs/cramfs/inode.c
index dd3634e..573d582 100644
--- a/fs/cramfs/inode.c
+++ b/fs/cramfs/inode.c
@@ -180,7 +180,7 @@ static void *cramfs_read(struct super_block *sb, unsigned int offset, unsigned i
 		struct page *page = pages[i];
 		if (page) {
 			wait_on_page_locked(page);
-			if (!PageUptodate(page)) {
+			if (PageError(page)) {
 				/* asynchronous error */
 				page_cache_release(page);
 				pages[i] = NULL;
diff --git a/mm/filemap.c b/mm/filemap.c
index 379ff0b..9ff8093 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1770,7 +1770,7 @@ struct page *read_cache_page(struct address_space *mapping,
 	if (IS_ERR(page))
 		goto out;
 	wait_on_page_locked(page);
-	if (!PageUptodate(page)) {
+	if (PageError(page)) {
 		page_cache_release(page);
 		page = ERR_PTR(-EIO);
 	}
-- 
1.6.2.4



* Re: [PATCH] Fix race between callers of read_cache_page_async and invalidate_inode_pages.
  2009-04-27  5:20 [PATCH] Fix race between callers of read_cache_page_async and invalidate_inode_pages Neil Brown
@ 2009-04-27  5:37 ` Andrew Morton
  2009-04-27  7:46   ` Neil Brown
  0 siblings, 1 reply; 3+ messages in thread
From: Andrew Morton @ 2009-04-27  5:37 UTC (permalink / raw)
  To: Neil Brown; +Cc: linux-kernel, linux-mm, Nick Piggin, David Woodhouse

On Mon, 27 Apr 2009 15:20:22 +1000 Neil Brown <neilb@suse.de> wrote:

> 
> 
> 
> Callers of read_cache_page_async typically wait for the page to become
> unlocked (wait_on_page_locked) and then test PageUptodate to see if
> the read was successful, or if there was an error.
> 
> This is wrong.
> 
> invalidate_inode_pages can cause an unlocked page to lose its
> PageUptodate flag at any time without implying a read error.

ow.

> As any read error will cause PageError to be set, it is much safer,
> and more idiomatic to test "PageError" than to test "!PageUptodate".
> 
> ...
>
> diff --git a/fs/cramfs/inode.c b/fs/cramfs/inode.c
> index dd3634e..573d582 100644
> --- a/fs/cramfs/inode.c
> +++ b/fs/cramfs/inode.c
> @@ -180,7 +180,7 @@ static void *cramfs_read(struct super_block *sb, unsigned int offset, unsigned i
>  		struct page *page = pages[i];
>  		if (page) {
>  			wait_on_page_locked(page);
> -			if (!PageUptodate(page)) {
> +			if (PageError(page)) {
>  				/* asynchronous error */
>  				page_cache_release(page);
>  				pages[i] = NULL;
> diff --git a/mm/filemap.c b/mm/filemap.c
> index 379ff0b..9ff8093 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -1770,7 +1770,7 @@ struct page *read_cache_page(struct address_space *mapping,
>  	if (IS_ERR(page))
>  		goto out;
>  	wait_on_page_locked(page);
> -	if (!PageUptodate(page)) {
> +	if (PageError(page)) {
>  		page_cache_release(page);
>  		page = ERR_PTR(-EIO);
>  	}

hrm.  And where is it written that PageError() will remain inviolable
after it has been set?

A safer and more formal (albeit somewhat slower) fix would be to lock
the page and check its state under the lock.
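
A minimal userspace sketch of what "check its state under the lock"
could look like.  Everything here is an assumption for illustration: a
pthread mutex stands in for the page lock, and struct model_page, enum
page_state and check_under_lock() are hypothetical names, not kernel
interfaces.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical model: the mutex plays the role of the page lock. */
struct model_page {
	pthread_mutex_t lock;
	bool uptodate;
	bool error;
};

enum page_state {
	PAGE_OK,          /* data is valid */
	PAGE_IO_ERROR,    /* a read error was recorded */
	PAGE_INVALIDATED  /* not uptodate, but no error recorded */
};

/*
 * Take the lock, sample both flags, drop the lock.  Nothing that
 * changes the flags under the same lock can interleave with the
 * two tests, so the result is a consistent snapshot.
 */
static enum page_state check_under_lock(struct model_page *p)
{
	enum page_state st;

	pthread_mutex_lock(&p->lock);
	if (p->error)
		st = PAGE_IO_ERROR;
	else if (!p->uptodate)
		st = PAGE_INVALIDATED;
	else
		st = PAGE_OK;
	pthread_mutex_unlock(&p->lock);

	return st;
}
```

The extra lock/unlock pair is the "somewhat slower" part; in exchange
the caller can tell an invalidated page apart from a failed read.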

y:/usr/src/linux-2.6.30-rc3> grep -r ClearPageError . | wc -l
21

?


* Re: [PATCH] Fix race between callers of read_cache_page_async and invalidate_inode_pages.
  2009-04-27  5:37 ` Andrew Morton
@ 2009-04-27  7:46   ` Neil Brown
  0 siblings, 0 replies; 3+ messages in thread
From: Neil Brown @ 2009-04-27  7:46 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, linux-mm, Nick Piggin, David Woodhouse

On Sunday April 26, akpm@linux-foundation.org wrote:
> On Mon, 27 Apr 2009 15:20:22 +1000 Neil Brown <neilb@suse.de> wrote:
> 
> hrm.  And where is it written that PageError() will remain inviolable
> after it has been set?

  ...it follows as night the day....

What use would PageError be if it can just disappear when you most
want to test it?
Then again, what use is PageUptodate if it can just disappear?  My
other thought for fixing this was to change truncate_complete_page to
not clear PageUptodate.....
Oh.  That's already been done in 2.6.27-rc2.

So I guess this isn't a bug in mainline anymore... sorry for the noise :-)
(I'll just go quietly fix some enterprise kernels).
> 
> A safer and more formal (albeit somewhat slower) fix would be to lock
> the page and check its state under the lock.
> 
> y:/usr/src/linux-2.6.30-rc3> grep -r ClearPageError . | wc -l
> 21

I think each of these does one of:
   - clear the error after a successful read
   - clear the error before a read attempt
   - clear the error before a write
all (I think) while the page is locked.  None of these would
invalidate the change I made (and I still think that it would read
better to say
   if (PageError(page))
     goto error;

than
   if (!PageUptodate(page))
     goto error;

but no matter).
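
As a rough illustration of the "clear the error before a read attempt"
case, here is a userspace sketch.  It assumes, as I believe is the case
above, that the flag is only changed while the page is locked;
struct model_page, model_read() and read_failed() are hypothetical
names, and io_ok is a made-up parameter standing in for the I/O result.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the page flags. */
struct model_page {
	bool locked;
	bool uptodate;
	bool error;
};

/* One read attempt, modeled as a single locked section. */
static void model_read(struct model_page *p, bool io_ok)
{
	assert(!p->locked);
	p->locked = true;

	p->error = false;          /* clear the error before the read attempt */
	if (io_ok)
		p->uptodate = true;    /* successful read */
	else
		p->error = true;       /* failed read records the error */

	p->locked = false;
}

/* What a waiter concludes once it has seen the page unlocked. */
static bool read_failed(const struct model_page *p)
{
	assert(!p->locked);
	return p->error;
}
```

Because the flag only changes inside the locked section, a waiter that
sleeps until the page is unlocked sees the outcome of the most recent
attempt and never a half-updated state.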

Thanks anyway.
NeilBrown
