From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from fgwmail2.fujitsu.co.jp ([164.71.1.135]:36084 "EHLO
	fgwmail2.fujitsu.co.jp" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1754752AbaFYHZc (ORCPT );
	Wed, 25 Jun 2014 03:25:32 -0400
Received: from kw-mxauth.gw.nic.fujitsu.com (unknown [10.0.237.134])
	by fgwmail2.fujitsu.co.jp (Postfix) with ESMTP id 8FA993EE0C7
	for ; Wed, 25 Jun 2014 16:25:30 +0900 (JST)
Received: from s3.gw.fujitsu.co.jp (s3.gw.nic.fujitsu.com [10.0.50.93])
	by kw-mxauth.gw.nic.fujitsu.com (Postfix) with ESMTP id A112EAC07BB
	for ; Wed, 25 Jun 2014 16:25:29 +0900 (JST)
Received: from g01jpfmpwkw02.exch.g01.fujitsu.local (g01jpfmpwkw02.exch.g01.fujitsu.local [10.0.193.56])
	by s3.gw.fujitsu.co.jp (Postfix) with ESMTP id 5911A1DB8038
	for ; Wed, 25 Jun 2014 16:25:29 +0900 (JST)
Message-ID: <53AA794D.2090407@jp.fujitsu.com>
Date: Wed, 25 Jun 2014 16:25:01 +0900
From: Satoru Takeuchi
MIME-Version: 1.0
To: Liu Bo , linux-btrfs
Subject: Re: [PATCH] Btrfs: fix crash when mounting raid5 btrfs with missing disks
References: <1403595556-32753-1-git-send-email-bo.li.liu@oracle.com>
In-Reply-To: <1403595556-32753-1-git-send-email-bo.li.liu@oracle.com>
Content-Type: text/plain; charset="ISO-2022-JP"
Sender: linux-btrfs-owner@vger.kernel.org
List-ID:

Hi Liu,

(2014/06/24 16:39), Liu Bo wrote:
> The reproducer is
>
> $ mkfs.btrfs D1 D2 D3 -mraid5
> $ mkfs.ext4 D2 && mkfs.ext4 D3
> $ mount D1 /btrfs -odegraded

Tested-by: Satoru Takeuchi

Here is the result of the last mount.

===
...
mount: wrong fs type, bad option, bad superblock on /dev/vdb1,
       missing codepage or helper program, or other error
       In some cases useful info is found in syslog - try
       dmesg | tail or so.
===

It "correctly" failed :-)

Thanks,
Satoru

>
> -------------------
>
> [ 87.672992] ------------[ cut here ]------------
> [ 87.673845] kernel BUG at fs/btrfs/raid56.c:1828!
> ...
> [ 87.673845] RIP: 0010:[] [] __raid_recover_end_io+0x4ae/0x4d0
> ...
> [ 87.673845] Call Trace:
> [ 87.673845] [] ? mempool_free+0x36/0xa0
> [ 87.673845] [] raid_recover_end_io+0x75/0xa0
> [ 87.673845] [] bio_endio+0x5b/0xa0
> [ 87.673845] [] bio_endio_nodec+0x12/0x20
> [ 87.673845] [] end_workqueue_fn+0x41/0x50
> [ 87.673845] [] normal_work_helper+0xca/0x2c0
> [ 87.673845] [] process_one_work+0x1eb/0x530
> [ 87.673845] [] ? process_one_work+0x189/0x530
> [ 87.673845] [] worker_thread+0x11b/0x4f0
> [ 87.673845] [] ? rescuer_thread+0x290/0x290
> [ 87.673845] [] kthread+0xe4/0x100
> [ 87.673845] [] ? kthread_create_on_node+0x220/0x220
> [ 87.673845] [] ret_from_fork+0x7c/0xb0
> [ 87.673845] [] ? kthread_create_on_node+0x220/0x220
>
> -------------------
>
> It's because that we miscalculate @rbio->bbio->error so that it doesn't
> reach maximum of tolerable errors while it should have.
>
> Signed-off-by: Liu Bo
> ---
>  fs/btrfs/raid56.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/fs/btrfs/raid56.c b/fs/btrfs/raid56.c
> index 4055291..4a88f07 100644
> --- a/fs/btrfs/raid56.c
> +++ b/fs/btrfs/raid56.c
> @@ -1956,9 +1956,10 @@ static int __raid56_parity_recover(struct btrfs_raid_bio *rbio)
>  	 * pages are going to be uptodate.
>  	 */
>  	for (stripe = 0; stripe < bbio->num_stripes; stripe++) {
> -		if (rbio->faila == stripe ||
> -		    rbio->failb == stripe)
> +		if (rbio->faila == stripe || rbio->failb == stripe) {
> +			atomic_inc(&rbio->bbio->error);
>  			continue;
> +		}
>
>  		for (pagenr = 0; pagenr < nr_pages; pagenr++) {
>  			struct page *p;
>
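
For anyone following the accounting the patch changes, here is a rough
user-space sketch of it. This is not kernel code: struct sketch_rbio and
recover_gives_up() are hypothetical names made up for illustration, a plain
int stands in for the atomic counter, and the "error > max_errors" check is
my assumption about how the counter is consumed after the reads complete.
The point is only that a stripe already known to have failed must be counted,
otherwise a second failure on raid5 still looks tolerable.

```c
#include <assert.h>

/* Hypothetical stand-in for the fields the patch touches; not the real
 * struct btrfs_raid_bio / btrfs_bio layout. */
struct sketch_rbio {
	int num_stripes;
	int faila;      /* index of first known-failed stripe, or -1 */
	int failb;      /* index of second known-failed stripe, or -1 */
	int max_errors; /* failures the profile tolerates (assumed 1 for raid5) */
	int error;      /* failures counted so far (atomic_t in the kernel) */
};

/*
 * Walk the stripes the way the patched loop does. io_failed[s] != 0 simulates
 * a read error on a stripe that was actually submitted. count_known_failed
 * selects the patched (1) or unpatched (0) behaviour.
 * Returns 1 if recovery should give up, 0 if it would go on to rebuild.
 */
static int recover_gives_up(struct sketch_rbio *r, const int *io_failed,
			    int count_known_failed)
{
	int stripe;

	assert(r->num_stripes > 0);
	r->error = 0;

	for (stripe = 0; stripe < r->num_stripes; stripe++) {
		if (r->faila == stripe || r->failb == stripe) {
			/* The patch adds this increment before skipping. */
			if (count_known_failed)
				r->error++;
			continue;
		}
		if (io_failed[stripe])
			r->error++;
	}

	return r->error > r->max_errors;
}
```

With three stripes, faila = 1, failb = -1, max_errors = 1 and a read error
on stripe 2, the unpatched path counts a single error and carries on to the
rebuild step, while the patched path counts two and gives up, which matches
the mount failing cleanly in my test above.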