All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] Prevent lookup from finding bad buffers
@ 2009-02-10  2:48 Lachlan McIlroy
  2009-02-15 20:12 ` Christoph Hellwig
  2009-11-25 20:26 ` Eric Sandeen
  0 siblings, 2 replies; 4+ messages in thread
From: Lachlan McIlroy @ 2009-02-10  2:48 UTC (permalink / raw)
  To: xfs-oss

There's a bug in _xfs_buf_find() that will cause it to return buffers
that failed to be initialised.

If a thread has a buffer locked and is waiting for I/O to initialise
it and another thread wants the same buffer the second thread will
wait on the buffer lock in _xfs_buf_find().  If the initial thread
gets an I/O error it marks the buffer in error and releases the
buffer lock.  The second thread gets the buffer lock, assumes the
buffer has been successfully initialised, and then tries to use it.

Some callers of xfs_buf_get_flags() will check for B_DONE, and if
it's not set then re-issue the I/O, bust most callers assume the
buffer and it's contents are good and then use the uninitialised
data.

The solution I've come up with is if we lookup a buffer and find
it's got b_error set or has been marked stale then unhash it from
the buffer hashtable and retry the lookup.  Also if we fail to setup
the buffer correctly in xfs_buf_get_flags() then mark the buffer in
error and unhash it.  If the buffer is marked stale then in
xfs_buf_free() inform the page cache that the contents of the pages
are no longer valid.

Index: xfs-patch/fs/xfs/linux-2.6/xfs_buf.c
===================================================================
--- xfs-patch.orig/fs/xfs/linux-2.6/xfs_buf.c
+++ xfs-patch/fs/xfs/linux-2.6/xfs_buf.c
@@ -271,6 +271,8 @@ xfs_buf_free(

  			if (bp->b_flags & _XBF_PAGE_CACHE)
  				ASSERT(!PagePrivate(page));
+			if (XFS_BUF_ISSTALE(bp))
+				ClearPageUptodate(page);
  			page_cache_release(page);
  		}
  		_xfs_buf_free_pages(bp);
@@ -430,7 +432,7 @@ _xfs_buf_find(
  	ASSERT(!(range_base & (xfs_off_t)btp->bt_smask));

  	hash = &btp->bt_hash[hash_long((unsigned long)ioff, btp->bt_hashshift)];
-
+retry:
  	spin_lock(&hash->bh_lock);

  	list_for_each_entry_safe(bp, n, &hash->bh_list, b_hash_list) {
@@ -489,9 +491,12 @@ found:
  		XB_SET_OWNER(bp);
  	}

-	if (bp->b_flags & XBF_STALE) {
+	if (bp->b_error || XFS_BUF_ISSTALE(bp)) {
  		ASSERT((bp->b_flags & _XBF_DELWRI_Q) == 0);
-		bp->b_flags &= XBF_MAPPED;
+		xfs_buf_unhash(bp);
+		xfs_buf_unlock(bp);
+		xfs_buf_rele(bp);
+		goto retry;
  	}
  	XB_TRACE(bp, "got_lock", 0);
  	XFS_STATS_INC(xb_get_locked);
@@ -553,6 +558,10 @@ xfs_buf_get_flags(
  	return bp;

   no_buffer:
+	if (error) {
+		xfs_buf_ioerror(bp, -error);
+		xfs_buf_unhash(bp);
+	}
  	if (flags & (XBF_LOCK | XBF_TRYLOCK))
  		xfs_buf_unlock(bp);
  	xfs_buf_rele(bp);
@@ -771,6 +780,20 @@ xfs_buf_hold(
  	XB_TRACE(bp, "hold", 0);
  }

+void
+xfs_buf_unhash(
+	xfs_buf_t		*bp)
+{
+	xfs_bufhash_t		*hash = bp->b_hash;
+
+	if (hash) {
+		spin_lock(&hash->bh_lock);
+		bp->b_hash = 0;
+		list_del_init(&bp->b_hash_list);
+		spin_unlock(&hash->bh_lock);
+	}
+}
+
  /*
   *	Releases a hold on the specified buffer.  If the
   *	the hold count is 1, calls xfs_buf_free.
Index: xfs-patch/fs/xfs/linux-2.6/xfs_buf.h
===================================================================
--- xfs-patch.orig/fs/xfs/linux-2.6/xfs_buf.h
+++ xfs-patch/fs/xfs/linux-2.6/xfs_buf.h
@@ -206,6 +206,7 @@ extern void xfs_buf_readahead(xfs_buftar
  /* Releasing Buffers */
  extern void xfs_buf_free(xfs_buf_t *);
  extern void xfs_buf_rele(xfs_buf_t *);
+extern void xfs_buf_unhash(xfs_buf_t *);

  /* Locking and Unlocking Buffers */
  extern int xfs_buf_cond_lock(xfs_buf_t *);

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] Prevent lookup from finding bad buffers
  2009-02-10  2:48 [PATCH] Prevent lookup from finding bad buffers Lachlan McIlroy
@ 2009-02-15 20:12 ` Christoph Hellwig
  2009-11-25 20:26 ` Eric Sandeen
  1 sibling, 0 replies; 4+ messages in thread
From: Christoph Hellwig @ 2009-02-15 20:12 UTC (permalink / raw)
  To: Lachlan McIlroy; +Cc: xfs-oss

On Tue, Feb 10, 2009 at 01:48:25PM +1100, Lachlan McIlroy wrote:
> There's a bug in _xfs_buf_find() that will cause it to return buffers
> that failed to be initialised.

Is there a testcase for this?

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] Prevent lookup from finding bad buffers
  2009-02-10  2:48 [PATCH] Prevent lookup from finding bad buffers Lachlan McIlroy
  2009-02-15 20:12 ` Christoph Hellwig
@ 2009-11-25 20:26 ` Eric Sandeen
  2009-11-25 22:29   ` Eric Sandeen
  1 sibling, 1 reply; 4+ messages in thread
From: Eric Sandeen @ 2009-11-25 20:26 UTC (permalink / raw)
  To: lachlan; +Cc: xfs-oss

Lachlan McIlroy wrote:
> There's a bug in _xfs_buf_find() that will cause it to return buffers
> that failed to be initialised.
> 
> If a thread has a buffer locked and is waiting for I/O to initialise
> it and another thread wants the same buffer the second thread will
> wait on the buffer lock in _xfs_buf_find().  If the initial thread
> gets an I/O error it marks the buffer in error and releases the
> buffer lock.  The second thread gets the buffer lock, assumes the
> buffer has been successfully initialised, and then tries to use it.
> 
> Some callers of xfs_buf_get_flags() will check for B_DONE, and if
> it's not set then re-issue the I/O, bust most callers assume the
> buffer and it's contents are good and then use the uninitialised
> data.
> 
> The solution I've come up with is if we lookup a buffer and find
> it's got b_error set or has been marked stale then unhash it from
> the buffer hashtable and retry the lookup.  Also if we fail to setup
> the buffer correctly in xfs_buf_get_flags() then mark the buffer in
> error and unhash it.  If the buffer is marked stale then in
> xfs_buf_free() inform the page cache that the contents of the pages
> are no longer valid.

I managed to come up with a sorta-kinda testcase for this.

Fragmented freespace, many files in a dir, on raid5; simply doing
drop caches / ls in a loop triggered it.

I guess raid5 is bad in this respect; in it's make_request() we have:

                } else {
                        /* cannot get stripe for read-ahead, just give-up */
                        clear_bit(BIO_UPTODATE, &bi->bi_flags);
                        finish_wait(&conf->wait_for_overlap, &w);
                        break;
                }

and this happens fairly often.  This probably explains a large
percentage of our xfs_da_do_buf(2) errors we've seen on the list.

>From my testing, I think this suffices - and interestingly, Lachlan's
original patch doesn't seem to help...

Comments?

Maybe could clean up the logic a bit... should this only be
tested for XBF_READ buffers as well ... or maybe an assert that
if !uptodate, error should be set ...

diff --git a/fs/xfs/linux-2.6/xfs_buf.c b/fs/xfs/linux-2.6/xfs_buf.c
index 965df12..cbc0541 100644
--- a/fs/xfs/linux-2.6/xfs_buf.c
+++ b/fs/xfs/linux-2.6/xfs_buf.c
@@ -1142,6 +1165,8 @@ xfs_buf_bio_end_io(
 		if (unlikely(bp->b_error)) {
 			if (bp->b_flags & XBF_READ)
 				ClearPageUptodate(page);
+		} else if (!test_bit(BIO_UPTODATE, &bio->bi_flags)) {
+			ClearPageUptodate(bp);
 		} else if (blocksize >= PAGE_CACHE_SIZE) {
 			SetPageUptodate(page);
 		} else if (!PagePrivate(page) &&

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] Prevent lookup from finding bad buffers
  2009-11-25 20:26 ` Eric Sandeen
@ 2009-11-25 22:29   ` Eric Sandeen
  0 siblings, 0 replies; 4+ messages in thread
From: Eric Sandeen @ 2009-11-25 22:29 UTC (permalink / raw)
  To: lachlan; +Cc: xfs-oss

Eric Sandeen wrote:
> Lachlan McIlroy wrote:
>> There's a bug in _xfs_buf_find() that will cause it to return buffers
>> that failed to be initialised.
>>
>> If a thread has a buffer locked and is waiting for I/O to initialise
>> it and another thread wants the same buffer the second thread will
>> wait on the buffer lock in _xfs_buf_find().  If the initial thread
>> gets an I/O error it marks the buffer in error and releases the
>> buffer lock.  The second thread gets the buffer lock, assumes the
>> buffer has been successfully initialised, and then tries to use it.
>>
>> Some callers of xfs_buf_get_flags() will check for B_DONE, and if
>> it's not set then re-issue the I/O, bust most callers assume the
>> buffer and it's contents are good and then use the uninitialised
>> data.
>>
>> The solution I've come up with is if we lookup a buffer and find
>> it's got b_error set or has been marked stale then unhash it from
>> the buffer hashtable and retry the lookup.  Also if we fail to setup
>> the buffer correctly in xfs_buf_get_flags() then mark the buffer in
>> error and unhash it.  If the buffer is marked stale then in
>> xfs_buf_free() inform the page cache that the contents of the pages
>> are no longer valid.
> 
> I managed to come up with a sorta-kinda testcase for this.
> 
> Fragmented freespace, many files in a dir, on raid5; simply doing
> drop caches / ls in a loop triggered it.
> 
> I guess raid5 is bad in this respect; in it's make_request() we have:
> 
>                 } else {
>                         /* cannot get stripe for read-ahead, just give-up */
>                         clear_bit(BIO_UPTODATE, &bi->bi_flags);
>                         finish_wait(&conf->wait_for_overlap, &w);
>                         break;
>                 }
> 
> and this happens fairly often.  This probably explains a large
> percentage of our xfs_da_do_buf(2) errors we've seen on the list.
> 
> From my testing, I think this suffices - and interestingly, Lachlan's
> original patch doesn't seem to help...
> 
> Comments?
> 
> Maybe could clean up the logic a bit... should this only be
> tested for XBF_READ buffers as well ... or maybe an assert that
> if !uptodate, error should be set ...
> 
> diff --git a/fs/xfs/linux-2.6/xfs_buf.c b/fs/xfs/linux-2.6/xfs_buf.c
> index 965df12..cbc0541 100644
> --- a/fs/xfs/linux-2.6/xfs_buf.c
> +++ b/fs/xfs/linux-2.6/xfs_buf.c
> @@ -1142,6 +1165,8 @@ xfs_buf_bio_end_io(
>  		if (unlikely(bp->b_error)) {
>  			if (bp->b_flags & XBF_READ)
>  				ClearPageUptodate(page);
> +		} else if (!test_bit(BIO_UPTODATE, &bio->bi_flags)) {
> +			ClearPageUptodate(bp);
>  		} else if (blocksize >= PAGE_CACHE_SIZE) {
>  			SetPageUptodate(page);
>  		} else if (!PagePrivate(page) &&

Ok, so that was shoot-from-the-hip, and actually it was tested on an
older kernel; upstream didn't demonstrate the problem, thanks to:

commit c2b00852fbae4f8c45c2651530ded3bd01bde814
Author: NeilBrown <neilb@suse.de>
Date:   Sun Dec 10 02:20:51 2006 -0800

[PATCH] md: return a non-zero error to bi_end_io as appropriate in raid5

Currently raid5 depends on clearing the BIO_UPTODATE flag to signal an
error
to higher levels.  While this should be sufficient, it is safer to
explicitly
set the error code as well - less room for confusion.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

As Neil says, xfs should probably cope too by checking uptodate itself
as well, though...

-Eric

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2009-11-25 22:28 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-02-10  2:48 [PATCH] Prevent lookup from finding bad buffers Lachlan McIlroy
2009-02-15 20:12 ` Christoph Hellwig
2009-11-25 20:26 ` Eric Sandeen
2009-11-25 22:29   ` Eric Sandeen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.