[PATCH] Btrfs: fix crash on endio of reading corrupted block


* [PATCH] Btrfs: fix crash on endio of reading corrupted block
@ 2014-08-19 15:33 Liu Bo
  2014-08-19 19:49 ` Chris Mason
                   ` (4 more replies)
  0 siblings, 5 replies; 10+ messages in thread
From: Liu Bo @ 2014-08-19 15:33 UTC
  To: linux-btrfs; +Cc: Chris Murphy

The crash is

------------[ cut here ]------------
kernel BUG at fs/btrfs/extent_io.c:2124!
[...]
Workqueue: btrfs-endio normal_work_helper [btrfs]
RIP: 0010:[<ffffffffa02d6055>]  [<ffffffffa02d6055>] end_bio_extent_readpage+0xb45/0xcd0 [btrfs]

This is in fact a regression.

It happens because we forget to advance @offset when reading a corrupted block,
so @offset stays unchanged. This leads to checksum errors while reading the
remaining blocks queued up in the same bio, and we end up hitting the above
BUG_ON.

Reported-by: Chris Murphy <lists@colorremedies.com>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
---
 fs/btrfs/extent_io.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 3af4966..be41e4d 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -2602,6 +2602,7 @@ static void end_bio_extent_readpage(struct bio *bio, int err)
 					test_bit(BIO_UPTODATE, &bio->bi_flags);
 				if (err)
 					uptodate = 0;
+				offset += len;
 				continue;
 			}
 		}
-- 
1.8.1.4



* Re: [PATCH] Btrfs: fix crash on endio of reading corrupted block
  2014-08-19 15:33 [PATCH] Btrfs: fix crash on endio of reading corrupted block Liu Bo
@ 2014-08-19 19:49 ` Chris Mason
  2014-08-19 21:42 ` Eric Sandeen
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 10+ messages in thread
From: Chris Mason @ 2014-08-19 19:49 UTC
  To: Liu Bo, linux-btrfs; +Cc: Chris Murphy



On 08/19/2014 11:33 AM, Liu Bo wrote:
> The crash is
> 
> ------------[ cut here ]------------
> kernel BUG at fs/btrfs/extent_io.c:2124!
> [...]
> Workqueue: btrfs-endio normal_work_helper [btrfs]
> RIP: 0010:[<ffffffffa02d6055>]  [<ffffffffa02d6055>] end_bio_extent_readpage+0xb45/0xcd0 [btrfs]
> 
> This is in fact a regression.
> 
> It happens because we forget to advance @offset when reading a corrupted block,
> so @offset stays unchanged. This leads to checksum errors while reading the
> remaining blocks queued up in the same bio, and we end up hitting the above
> BUG_ON.

Thanks Chris and Liu, this is queued.

-chris


* Re: [PATCH] Btrfs: fix crash on endio of reading corrupted block
  2014-08-19 15:33 [PATCH] Btrfs: fix crash on endio of reading corrupted block Liu Bo
  2014-08-19 19:49 ` Chris Mason
@ 2014-08-19 21:42 ` Eric Sandeen
  2014-08-20  8:20   ` Liu Bo
  2014-08-20  8:51 ` [PATCH v2] " Liu Bo
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 10+ messages in thread
From: Eric Sandeen @ 2014-08-19 21:42 UTC
  To: Liu Bo, linux-btrfs; +Cc: Chris Murphy

On 8/19/14, 10:33 AM, Liu Bo wrote:
> The crash is
> 
> ------------[ cut here ]------------
> kernel BUG at fs/btrfs/extent_io.c:2124!
> [...]
> Workqueue: btrfs-endio normal_work_helper [btrfs]
> RIP: 0010:[<ffffffffa02d6055>]  [<ffffffffa02d6055>] end_bio_extent_readpage+0xb45/0xcd0 [btrfs]
> 
> This is in fact a regression.

It'd be helpful to identify the commit, or at least kernel release, which caused
the regression.

> It happens because we forget to advance @offset when reading a corrupted block,
> so @offset stays unchanged. This leads to checksum errors while reading the
> remaining blocks queued up in the same bio, and we end up hitting the above
> BUG_ON.

So does that mean that any checksum error on this path will crash the kernel?

That sounds like this bug has exposed a more fundamental problem, no?

Thanks,
-Eric

> Reported-by: Chris Murphy <lists@colorremedies.com>
> Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
> ---
>  fs/btrfs/extent_io.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
> index 3af4966..be41e4d 100644
> --- a/fs/btrfs/extent_io.c
> +++ b/fs/btrfs/extent_io.c
> @@ -2602,6 +2602,7 @@ static void end_bio_extent_readpage(struct bio *bio, int err)
>  					test_bit(BIO_UPTODATE, &bio->bi_flags);
>  				if (err)
>  					uptodate = 0;
> +				offset += len;
>  				continue;
>  			}
>  		}
> 



* Re: [PATCH] Btrfs: fix crash on endio of reading corrupted block
  2014-08-19 21:42 ` Eric Sandeen
@ 2014-08-20  8:20   ` Liu Bo
  0 siblings, 0 replies; 10+ messages in thread
From: Liu Bo @ 2014-08-20  8:20 UTC
  To: Eric Sandeen; +Cc: linux-btrfs, Chris Murphy

On Tue, Aug 19, 2014 at 04:42:42PM -0500, Eric Sandeen wrote:
> On 8/19/14, 10:33 AM, Liu Bo wrote:
> > The crash is
> > 
> > ------------[ cut here ]------------
> > kernel BUG at fs/btrfs/extent_io.c:2124!
> > [...]
> > Workqueue: btrfs-endio normal_work_helper [btrfs]
> > RIP: 0010:[<ffffffffa02d6055>]  [<ffffffffa02d6055>] end_bio_extent_readpage+0xb45/0xcd0 [btrfs]
> > 
> > This is in fact a regression.
> 
> It'd be helpful to identify the commit, or at least kernel release, which caused
> the regression.

Okay, got it.

> 
> > It happens because we forget to advance @offset when reading a corrupted block,
> > so @offset stays unchanged. This leads to checksum errors while reading the
> > remaining blocks queued up in the same bio, and we end up hitting the above
> > BUG_ON.
> 
> So does that mean that any checksum error on this path will crash the kernel?
> 
> That sounds like this bug has exposed a more fundamental problem, no?

Eric, you're right. I left out some details, and I'm writing a new commit log now...

thanks,
-liubo

> 
> Thanks,
> -Eric
> 
> > Reported-by: Chris Murphy <lists@colorremedies.com>
> > Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
> > ---
> >  fs/btrfs/extent_io.c | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
> > index 3af4966..be41e4d 100644
> > --- a/fs/btrfs/extent_io.c
> > +++ b/fs/btrfs/extent_io.c
> > @@ -2602,6 +2602,7 @@ static void end_bio_extent_readpage(struct bio *bio, int err)
> >  					test_bit(BIO_UPTODATE, &bio->bi_flags);
> >  				if (err)
> >  					uptodate = 0;
> > +				offset += len;
> >  				continue;
> >  			}
> >  		}
> > 
> 


* [PATCH v2] Btrfs: fix crash on endio of reading corrupted block
  2014-08-19 15:33 [PATCH] Btrfs: fix crash on endio of reading corrupted block Liu Bo
  2014-08-19 19:49 ` Chris Mason
  2014-08-19 21:42 ` Eric Sandeen
@ 2014-08-20  8:51 ` Liu Bo
  2014-08-22 15:32   ` Eric Sandeen
  2014-08-23  3:59 ` Liu Bo
  2014-08-23  4:00 ` [PATCH v3] " Liu Bo
  4 siblings, 1 reply; 10+ messages in thread
From: Liu Bo @ 2014-08-20  8:51 UTC
  To: linux-btrfs

The crash is

------------[ cut here ]------------
kernel BUG at fs/btrfs/extent_io.c:2124!
invalid opcode: 0000 [#1] SMP
...
CPU: 3 PID: 88 Comm: kworker/u8:7 Not tainted 3.17.0-0.rc1.git0.1.fc22.x86_64 #1
Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
Workqueue: btrfs-endio normal_work_helper [btrfs]
task: ffff8800d7152700 ti: ffff8800d729c000 task.ti: ffff8800d729c000
RIP: 0010:[<ffffffffa02d6055>]  [<ffffffffa02d6055>] end_bio_extent_readpage+0xb45/0xcd0 [btrfs]
Call Trace:
  [<ffffffff810c3ef8>] ? __enqueue_entity+0x78/0x80
  [<ffffffff810ca969>] ? enqueue_entity+0x2e9/0x990
  [<ffffffff813464ab>] bio_endio+0x6b/0xa0
  [<ffffffff813464f2>] bio_endio_nodec+0x12/0x20
  [<ffffffffa02ab217>] end_workqueue_fn+0x37/0x40 [btrfs]
  [<ffffffffa02e4b5d>] normal_work_helper+0xbd/0x280 [btrfs]
  [<ffffffff810ac4fe>] process_one_work+0x17e/0x430
  [<ffffffff810ace8b>] worker_thread+0x6b/0x4a0
  [<ffffffff810ace20>] ? rescuer_thread+0x2a0/0x2a0
  [<ffffffff810b1fca>] kthread+0xea/0x100
  [<ffffffff810b1ee0>] ? kthread_create_on_node+0x1a0/0x1a0
  [<ffffffff8173dd7c>] ret_from_fork+0x7c/0xb0
  [<ffffffff810b1ee0>] ? kthread_create_on_node+0x1a0/0x1a0

This is in fact a regression.

It happens because we forget to advance @offset when reading a corrupted block,
so @offset stays unchanged. This leads to checksum errors while reading the
remaining blocks queued up in the same bio. Btrfs then tries to iterate over the
copies of those blocks in order to get good data, and hits the BUG_ON() that is
there to prevent looking for good copies of blocks that have no problems.

Reported-by: Chris Murphy <lists@colorremedies.com>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
---
v2:
   - Make the commit log clearer, as suggested by Eric.

 fs/btrfs/extent_io.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 3af4966..be41e4d 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -2602,6 +2602,7 @@ static void end_bio_extent_readpage(struct bio *bio, int err)
 					test_bit(BIO_UPTODATE, &bio->bi_flags);
 				if (err)
 					uptodate = 0;
+				offset += len;
 				continue;
 			}
 		}
-- 
1.8.1.4



* Re: [PATCH v2] Btrfs: fix crash on endio of reading corrupted block
  2014-08-20  8:51 ` [PATCH v2] " Liu Bo
@ 2014-08-22 15:32   ` Eric Sandeen
  2014-08-23  3:53     ` Liu Bo
  0 siblings, 1 reply; 10+ messages in thread
From: Eric Sandeen @ 2014-08-22 15:32 UTC
  To: Liu Bo, linux-btrfs

On 8/20/14, 3:51 AM, Liu Bo wrote:
> The crash is
> 
> ------------[ cut here ]------------
> kernel BUG at fs/btrfs/extent_io.c:2124!
> invalid opcode: 0000 [#1] SMP

...

> ---
> v2:
>    - Improve the commit log to be clear, suggested by Eric.

Well, I had specifically asked for it to include details on when
the regression occurred, but didn't get that...  ;)

If you state that it's a regression, people may start wondering
if their kernels are vulnerable, how far back into -stable it
should go, which distros should pick up the fix, etc.  If you
don't say when it regressed, we're all left wondering...

Thanks,

-Eric


* Re: [PATCH v2] Btrfs: fix crash on endio of reading corrupted block
  2014-08-22 15:32   ` Eric Sandeen
@ 2014-08-23  3:53     ` Liu Bo
  0 siblings, 0 replies; 10+ messages in thread
From: Liu Bo @ 2014-08-23  3:53 UTC
  To: Eric Sandeen; +Cc: linux-btrfs

On Fri, Aug 22, 2014 at 10:32:13AM -0500, Eric Sandeen wrote:
> On 8/20/14, 3:51 AM, Liu Bo wrote:
> > The crash is
> > 
> > ------------[ cut here ]------------
> > kernel BUG at fs/btrfs/extent_io.c:2124!
> > invalid opcode: 0000 [#1] SMP
> 
> ...
> 
> > ---
> > v2:
> >    - Improve the commit log to be clear, suggested by Eric.
> 
> Well, I had specifically asked for it to include details on when
> the regression occurred, but didn't get that...  ;)
> 
> If you state that it's a regression, people may start wondering
> if their kernels are vulnerable, how far back into -stable it
> should go, which distros should pick up the fix, etc.  If you
> don't say when it regressed, we're all left wondering...

Oh yeah, now I get it :)

thanks,
-liubo

> 
> Thanks,
> 
> -Eric
> 


* [PATCH v2] Btrfs: fix crash on endio of reading corrupted block
  2014-08-19 15:33 [PATCH] Btrfs: fix crash on endio of reading corrupted block Liu Bo
                   ` (2 preceding siblings ...)
  2014-08-20  8:51 ` [PATCH v2] " Liu Bo
@ 2014-08-23  3:59 ` Liu Bo
  2014-08-23  4:00 ` [PATCH v3] " Liu Bo
  4 siblings, 0 replies; 10+ messages in thread
From: Liu Bo @ 2014-08-23  3:59 UTC
  To: linux-btrfs; +Cc: Eric Sandeen

The crash is

------------[ cut here ]------------
kernel BUG at fs/btrfs/extent_io.c:2124!
invalid opcode: 0000 [#1] SMP
...
CPU: 3 PID: 88 Comm: kworker/u8:7 Not tainted 3.17.0-0.rc1.git0.1.fc22.x86_64 #1
Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
Workqueue: btrfs-endio normal_work_helper [btrfs]
task: ffff8800d7152700 ti: ffff8800d729c000 task.ti: ffff8800d729c000
RIP: 0010:[<ffffffffa02d6055>]  [<ffffffffa02d6055>] end_bio_extent_readpage+0xb45/0xcd0 [btrfs]
Call Trace:
  [<ffffffff810c3ef8>] ? __enqueue_entity+0x78/0x80
  [<ffffffff810ca969>] ? enqueue_entity+0x2e9/0x990
  [<ffffffff813464ab>] bio_endio+0x6b/0xa0
  [<ffffffff813464f2>] bio_endio_nodec+0x12/0x20
  [<ffffffffa02ab217>] end_workqueue_fn+0x37/0x40 [btrfs]
  [<ffffffffa02e4b5d>] normal_work_helper+0xbd/0x280 [btrfs]
  [<ffffffff810ac4fe>] process_one_work+0x17e/0x430
  [<ffffffff810ace8b>] worker_thread+0x6b/0x4a0
  [<ffffffff810ace20>] ? rescuer_thread+0x2a0/0x2a0
  [<ffffffff810b1fca>] kthread+0xea/0x100
  [<ffffffff810b1ee0>] ? kthread_create_on_node+0x1a0/0x1a0
  [<ffffffff8173dd7c>] ret_from_fork+0x7c/0xb0
  [<ffffffff810b1ee0>] ? kthread_create_on_node+0x1a0/0x1a0

This is in fact a regression introduced by commit
facc8a2247340a9735fe8cc123c5da2102f5ef1b ("Btrfs: don't cache the csum value
into the extent state tree").

It happens because we forget to advance @offset when reading a corrupted block,
so @offset stays unchanged. This leads to checksum errors while reading the
remaining blocks queued up in the same bio. Btrfs then tries to iterate over the
copies of those blocks in order to get good data, and hits the BUG_ON() that is
there to prevent looking for good copies of blocks that have no problems.

Reported-by: Chris Murphy <lists@colorremedies.com>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
---
v2:
   - Make the commit log clearer, as suggested by Eric.
v3:
   - Name the commit that introduced this bug; I forgot to add this in v2.

 fs/btrfs/extent_io.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 3af4966..be41e4d 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -2602,6 +2602,7 @@ static void end_bio_extent_readpage(struct bio *bio, int err)
 					test_bit(BIO_UPTODATE, &bio->bi_flags);
 				if (err)
 					uptodate = 0;
+				offset += len;
 				continue;
 			}
 		}
-- 
1.8.1.4



* [PATCH v3] Btrfs: fix crash on endio of reading corrupted block
  2014-08-19 15:33 [PATCH] Btrfs: fix crash on endio of reading corrupted block Liu Bo
                   ` (3 preceding siblings ...)
  2014-08-23  3:59 ` Liu Bo
@ 2014-08-23  4:00 ` Liu Bo
  2014-09-02 20:17   ` Chris Murphy
  4 siblings, 1 reply; 10+ messages in thread
From: Liu Bo @ 2014-08-23  4:00 UTC
  To: linux-btrfs; +Cc: Eric Sandeen

The crash is

------------[ cut here ]------------
kernel BUG at fs/btrfs/extent_io.c:2124!
invalid opcode: 0000 [#1] SMP
...
CPU: 3 PID: 88 Comm: kworker/u8:7 Not tainted 3.17.0-0.rc1.git0.1.fc22.x86_64 #1
Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
Workqueue: btrfs-endio normal_work_helper [btrfs]
task: ffff8800d7152700 ti: ffff8800d729c000 task.ti: ffff8800d729c000
RIP: 0010:[<ffffffffa02d6055>]  [<ffffffffa02d6055>] end_bio_extent_readpage+0xb45/0xcd0 [btrfs]
Call Trace:
  [<ffffffff810c3ef8>] ? __enqueue_entity+0x78/0x80
  [<ffffffff810ca969>] ? enqueue_entity+0x2e9/0x990
  [<ffffffff813464ab>] bio_endio+0x6b/0xa0
  [<ffffffff813464f2>] bio_endio_nodec+0x12/0x20
  [<ffffffffa02ab217>] end_workqueue_fn+0x37/0x40 [btrfs]
  [<ffffffffa02e4b5d>] normal_work_helper+0xbd/0x280 [btrfs]
  [<ffffffff810ac4fe>] process_one_work+0x17e/0x430
  [<ffffffff810ace8b>] worker_thread+0x6b/0x4a0
  [<ffffffff810ace20>] ? rescuer_thread+0x2a0/0x2a0
  [<ffffffff810b1fca>] kthread+0xea/0x100
  [<ffffffff810b1ee0>] ? kthread_create_on_node+0x1a0/0x1a0
  [<ffffffff8173dd7c>] ret_from_fork+0x7c/0xb0
  [<ffffffff810b1ee0>] ? kthread_create_on_node+0x1a0/0x1a0

This is in fact a regression introduced by commit
facc8a2247340a9735fe8cc123c5da2102f5ef1b ("Btrfs: don't cache the csum value
into the extent state tree").

It happens because we forget to advance @offset when reading a corrupted block,
so @offset stays unchanged. This leads to checksum errors while reading the
remaining blocks queued up in the same bio. Btrfs then tries to iterate over the
copies of those blocks in order to get good data, and hits the BUG_ON() that is
there to prevent looking for good copies of blocks that have no problems.

Reported-by: Chris Murphy <lists@colorremedies.com>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
---
v2:
   - Make the commit log clearer, as suggested by Eric.
v3:
   - Name the commit that introduced this bug; I forgot to add this in v2.

 fs/btrfs/extent_io.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 3af4966..be41e4d 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -2602,6 +2602,7 @@ static void end_bio_extent_readpage(struct bio *bio, int err)
 					test_bit(BIO_UPTODATE, &bio->bi_flags);
 				if (err)
 					uptodate = 0;
+				offset += len;
 				continue;
 			}
 		}
-- 
1.8.1.4
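
The effect of the missing "offset += len" described in the commit log above
can be modelled in a few lines of user-space C. This is only a sketch under
stated assumptions: struct segment, count_bad_segments() and demo() are
hypothetical names, the per-block checksum array merely stands in for the
checksums the kernel looks up by offset, and none of this is the actual
btrfs code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE 4096u

/* Toy stand-in for one segment of a read bio (hypothetical, not struct bio_vec). */
struct segment {
	uint32_t data_csum;	/* checksum computed from the data that was read */
	uint32_t len;		/* bytes covered by this segment */
};

/*
 * Walk the segments of one "bio" and count how many are reported as bad.
 * expected[] holds one stored checksum per block, indexed by
 * offset / BLOCK_SIZE. When advance_on_error is 0, the error path skips
 * "offset += len", which mimics the bug described in the commit log.
 */
static int count_bad_segments(const struct segment *segs, size_t nr,
			      const uint32_t *expected, int advance_on_error)
{
	uint64_t offset = 0;
	int bad = 0;

	assert(segs != NULL && expected != NULL);

	for (size_t i = 0; i < nr; i++) {
		uint32_t want = expected[offset / BLOCK_SIZE];

		if (segs[i].data_csum != want) {
			bad++;
			if (advance_on_error)
				offset += segs[i].len;	/* the one-line fix */
			continue;
		}
		offset += segs[i].len;
	}
	return bad;
}

/* Run both variants on a four-block bio whose second block is corrupted. */
static void demo(int *bad_fixed, int *bad_buggy)
{
	const uint32_t expected[] = { 10, 20, 30, 40 };
	const struct segment segs[] = {
		{ 10, BLOCK_SIZE }, { 99, BLOCK_SIZE },
		{ 30, BLOCK_SIZE }, { 40, BLOCK_SIZE },
	};
	size_t nr = sizeof(segs) / sizeof(segs[0]);

	*bad_fixed = count_bad_segments(segs, nr, expected, 1);
	*bad_buggy = count_bad_segments(segs, nr, expected, 0);
}
```

In this model, the variant that advances the offset on the error path reports
only the one corrupted block, while the variant that skips the advance keeps
comparing every later block against the stale checksum slot and so flags the
good blocks behind it as well.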



* Re: [PATCH v3] Btrfs: fix crash on endio of reading corrupted block
  2014-08-23  4:00 ` [PATCH v3] " Liu Bo
@ 2014-09-02 20:17   ` Chris Murphy
  0 siblings, 0 replies; 10+ messages in thread
From: Chris Murphy @ 2014-09-02 20:17 UTC
  To: Liu Bo; +Cc: linux-btrfs, Eric Sandeen


On Aug 22, 2014, at 10:00 PM, Liu Bo <bo.li.liu@oracle.com> wrote:

> The crash is
> 
> ------------[ cut here ]------------
> kernel BUG at fs/btrfs/extent_io.c:2124!
> invalid opcode: 0000 [#1] SMP
> ...
> CPU: 3 PID: 88 Comm: kworker/u8:7 Not tainted 3.17.0-0.rc1.git0.1.fc22.x86_64 #1
> Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
> Workqueue: btrfs-endio normal_work_helper [btrfs]
> task: ffff8800d7152700 ti: ffff8800d729c000 task.ti: ffff8800d729c000
> RIP: 0010:[<ffffffffa02d6055>]  [<ffffffffa02d6055>] end_bio_extent_readpage+0xb45/0xcd0 [btrfs]
> Call Trace:
>  [<ffffffff810c3ef8>] ? __enqueue_entity+0x78/0x80
>  [<ffffffff810ca969>] ? enqueue_entity+0x2e9/0x990
>  [<ffffffff813464ab>] bio_endio+0x6b/0xa0
>  [<ffffffff813464f2>] bio_endio_nodec+0x12/0x20
>  [<ffffffffa02ab217>] end_workqueue_fn+0x37/0x40 [btrfs]
>  [<ffffffffa02e4b5d>] normal_work_helper+0xbd/0x280 [btrfs]
>  [<ffffffff810ac4fe>] process_one_work+0x17e/0x430
>  [<ffffffff810ace8b>] worker_thread+0x6b/0x4a0
>  [<ffffffff810ace20>] ? rescuer_thread+0x2a0/0x2a0
>  [<ffffffff810b1fca>] kthread+0xea/0x100
>  [<ffffffff810b1ee0>] ? kthread_create_on_node+0x1a0/0x1a0
>  [<ffffffff8173dd7c>] ret_from_fork+0x7c/0xb0
>  [<ffffffff810b1ee0>] ? kthread_create_on_node+0x1a0/0x1a0
> 
> This is in fact a regression introduced by commit
> facc8a2247340a9735fe8cc123c5da2102f5ef1b ("Btrfs: don't cache the csum value
> into the extent state tree").
> 
> It happens because we forget to advance @offset when reading a corrupted block,
> so @offset stays unchanged. This leads to checksum errors while reading the
> remaining blocks queued up in the same bio. Btrfs then tries to iterate over the
> copies of those blocks in order to get good data, and hits the BUG_ON() that is
> there to prevent looking for good copies of blocks that have no problems.
> 
> Reported-by: Chris Murphy <lists@colorremedies.com>
> Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
> ---
> v2:
>   - Improve the commit log to be clear, suggested by Eric.
> v3:
>   - Show the commit that introduces this bug, I forgot to add this in the v2
>     version.
> 
> fs/btrfs/extent_io.c | 1 +
> 1 file changed, 1 insertion(+)
> 
> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
> index 3af4966..be41e4d 100644
> --- a/fs/btrfs/extent_io.c
> +++ b/fs/btrfs/extent_io.c
> @@ -2602,6 +2602,7 @@ static void end_bio_extent_readpage(struct bio *bio, int err)
> 					test_bit(BIO_UPTODATE, &bio->bi_flags);
> 				if (err)
> 					uptodate = 0;
> +				offset += len;
> 				continue;
> 			}
> 		}
> -- 
> 1.8.1.4
> 
> --

Cannot reproduce the crash with the same steps on kernel 3.17.0-0.rc3.git0.1.fc22.x86_64.


Chris Murphy

