linux-btrfs.vger.kernel.org archive mirror
* [PATCH v2 1/4] Btrfs-progs: fsck: only allow partial opening under repair mode
@ 2014-05-29  9:59 Wang Shilong
  2014-05-29  9:59 ` [PATCH v2 3/4] Btrfs-progs: fsck: deal with corrupted csum root Wang Shilong
  2014-06-02 16:18 ` [PATCH v2 1/4] Btrfs-progs: fsck: only allow partial opening under repair mode David Sterba
  0 siblings, 2 replies; 8+ messages in thread
From: Wang Shilong @ 2014-05-29  9:59 UTC (permalink / raw)
  To: linux-btrfs

Partial opening is allowed because some trees (for example, the
extent tree) may be corrupted, and in the fsck repair case the
broken tree may be rebuilt later.

If users only want to check and not repair anything, this patch
makes fsck return failure as soon as possible and tell users
that some critical roots have been corrupted.

Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
---
v1->v2: add necessary changelog. (Thanks to Eric)
---
 cmds-check.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/cmds-check.c b/cmds-check.c
index db7df80..0e4e042 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -6810,8 +6810,7 @@ int cmd_check(int argc, char **argv)
 	int option_index = 0;
 	int init_csum_tree = 0;
 	int qgroup_report = 0;
-	enum btrfs_open_ctree_flags ctree_flags =
-		OPEN_CTREE_PARTIAL | OPEN_CTREE_EXCLUSIVE;
+	enum btrfs_open_ctree_flags ctree_flags = OPEN_CTREE_EXCLUSIVE;
 
 	while(1) {
 		int c;
@@ -6877,6 +6876,10 @@ int cmd_check(int argc, char **argv)
 		goto err_out;
 	}
 
+	/* only allow partial opening under repair mode */
+	if (repair)
+		ctree_flags |= OPEN_CTREE_PARTIAL;
+
 	info = open_ctree_fs_info(argv[optind], bytenr, 0, ctree_flags);
 	if (!info) {
 		fprintf(stderr, "Couldn't open file system\n");
-- 
1.9.0



* [PATCH v2 3/4] Btrfs-progs: fsck: deal with corrupted csum root
  2014-05-29  9:59 [PATCH v2 1/4] Btrfs-progs: fsck: only allow partial opening under repair mode Wang Shilong
@ 2014-05-29  9:59 ` Wang Shilong
  2014-06-02 17:27   ` David Sterba
  2014-06-02 16:18 ` [PATCH v2 1/4] Btrfs-progs: fsck: only allow partial opening under repair mode David Sterba
  1 sibling, 1 reply; 8+ messages in thread
From: Wang Shilong @ 2014-05-29  9:59 UTC (permalink / raw)
  To: linux-btrfs

If the checksum root is corrupted, fsck will segfault. This is
because if we fail to load the checksum root, the root's node is NULL,
which causes NULL pointer dereferences later.

To fix this problem, we do something similar to extent tree rebuilding:
allocate a new node and clear its uptodate flag. We then do a sanity
check before fsck goes on.

Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
---
v1->v2: fix typo for output message.
---
 cmds-check.c | 5 +++++
 disk-io.c    | 7 +++++++
 2 files changed, 12 insertions(+)

diff --git a/cmds-check.c b/cmds-check.c
index 0e4e042..ad5514e 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -6963,6 +6963,11 @@ int cmd_check(int argc, char **argv)
 		ret = -EIO;
 		goto close_out;
 	}
+	if (!extent_buffer_uptodate(info->csum_root->node)) {
+		fprintf(stderr, "Checksum root corrupted, rerun with --init-csum-tree option\n");
+		ret = -EIO;
+		goto close_out;
+	}
 
 	fprintf(stderr, "checking extents\n");
 	ret = check_chunks_and_extents(root);
diff --git a/disk-io.c b/disk-io.c
index 63e153d..bbfd8e7 100644
--- a/disk-io.c
+++ b/disk-io.c
@@ -914,6 +914,13 @@ int btrfs_setup_all_roots(struct btrfs_fs_info *fs_info, u64 root_tree_bytenr,
 		printk("Couldn't setup csum tree\n");
 		if (!(flags & OPEN_CTREE_PARTIAL))
 			return -EIO;
+		/* do the same thing as extent tree rebuilding */
+		fs_info->csum_root->node =
+			btrfs_find_create_tree_block(fs_info->extent_root, 0,
+						     leafsize);
+		if (!fs_info->csum_root->node)
+			return -ENOMEM;
+		clear_extent_buffer_uptodate(NULL, fs_info->csum_root->node);
 	}
 	fs_info->csum_root->track_dirty = 1;
 
-- 
1.9.0



* Re: [PATCH v2 1/4] Btrfs-progs: fsck: only allow partial opening under repair mode
  2014-05-29  9:59 [PATCH v2 1/4] Btrfs-progs: fsck: only allow partial opening under repair mode Wang Shilong
  2014-05-29  9:59 ` [PATCH v2 3/4] Btrfs-progs: fsck: deal with corrupted csum root Wang Shilong
@ 2014-06-02 16:18 ` David Sterba
  2014-06-03  3:35   ` Wang Shilong
  1 sibling, 1 reply; 8+ messages in thread
From: David Sterba @ 2014-06-02 16:18 UTC (permalink / raw)
  To: Wang Shilong; +Cc: linux-btrfs

On Thu, May 29, 2014 at 05:59:56PM +0800, Wang Shilong wrote:
> The reason that we allow partial opening is that sometimes,
> we may have some corrupted trees.(for example extent tree), for
> fsck repair case, the broken tree may be rebuilt later.
> 
> So if users only want to do check but not repair anything, this
> patch will make fsck return failure as soon as possible and
> tell users that some critial roots have been corrupted.

Ok, that partially answers my question under v1. This would be a
different mode, e.g. a fast check, that would bail out quickly as you
intend. I'd really want the (full) check and repair to keep doing the
same sort of checks and verification.
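
The flag selection under discussion can be sketched roughly as follows. This is a minimal illustration only: the SKETCH_OPEN_* constants and sketch_check_flags() are hypothetical stand-ins, not the btrfs-progs definitions, and the numeric values are arbitrary.

```c
#include <assert.h>

/* Hypothetical stand-ins for the open flags; values are arbitrary. */
enum sketch_open_flags {
	SKETCH_OPEN_PARTIAL   = 1 << 0,
	SKETCH_OPEN_EXCLUSIVE = 1 << 1,
};

/*
 * Pick the open flags for a check run: always exclusive, and
 * partial only when repairing (assumption: mirrors the v2 patch).
 */
unsigned int sketch_check_flags(int repair)
{
	unsigned int flags = SKETCH_OPEN_EXCLUSIVE;

	if (repair)
		flags |= SKETCH_OPEN_PARTIAL;

	/* Invariant: a check-only run never opens partially. */
	assert(repair || !(flags & SKETCH_OPEN_PARTIAL));
	return flags;
}
```

With this shape, a check-only run fails at open time when a critical root cannot be loaded, while a repair run still gets far enough to rebuild the broken tree.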


* Re: [PATCH v2 3/4] Btrfs-progs: fsck: deal with corrupted csum root
  2014-05-29  9:59 ` [PATCH v2 3/4] Btrfs-progs: fsck: deal with corrupted csum root Wang Shilong
@ 2014-06-02 17:27   ` David Sterba
  2014-06-03  3:25     ` Wang Shilong
  0 siblings, 1 reply; 8+ messages in thread
From: David Sterba @ 2014-06-02 17:27 UTC (permalink / raw)
  To: Wang Shilong; +Cc: linux-btrfs, clm

On Thu, May 29, 2014 at 05:59:57PM +0800, Wang Shilong wrote:
> If checksum root is corrupted, fsck will get segmentation. This
> is because if we fail to load checksum root, root's node is NULL which
> cause NULL pointer deferences later.
> 
> To fix this problem, we just did something like extent tree rebuilding.
> Allocate a new one and clear uptodate flag. We will do sanity check
> before fsck going on.

I'm a bit worried about recommending --init-csum-tree, though in this
case there's not much else left to do. A filesystem with an initialized
csum tree will mount, but reading non-inline data will produce 'csum
missing' errors.

> --- a/cmds-check.c
> +++ b/cmds-check.c
> @@ -6963,6 +6963,11 @@ int cmd_check(int argc, char **argv)
>  		ret = -EIO;
>  		goto close_out;
>  	}
> +	if (!extent_buffer_uptodate(info->csum_root->node)) {
> +		fprintf(stderr, "Checksum root corrupted, rerun with --init-csum-tree option\n");
> +		ret = -EIO;
> +		goto close_out;

So this should prevent segfaults due to missing csum tree, fine. The
error message can copy what the broken extent tree reports a few lines
above.

And now that I'm looking at other extent_buffer_uptodate(tree) checks in
the function, for clarity, each root check should be done separately and
followed by a message that says which tree is broken.
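
A per-root check of this kind might look roughly like the sketch below. It is only an illustration under simplifying assumptions: struct sketch_node, struct sketch_root and both functions are hypothetical stand-ins, not the btrfs-progs types or helpers, and the message wording is made up.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified stand-ins for the real structures. */
struct sketch_node { int uptodate; };
struct sketch_root { const char *name; struct sketch_node *node; };

/* Return 1 if the root node was loaded and is marked up to date. */
int sketch_root_ok(const struct sketch_root *root)
{
	return root && root->node && root->node->uptodate;
}

/*
 * Check each root separately and name the first broken one.
 * Returns the index of the broken root, or -1 if all are usable.
 */
int sketch_first_broken_root(const struct sketch_root *roots, size_t n,
			     FILE *err)
{
	for (size_t i = 0; i < n; i++) {
		if (!sketch_root_ok(&roots[i])) {
			fprintf(err, "%s root corrupted\n", roots[i].name);
			return (int)i;
		}
	}
	return -1;
}
```

Reporting the tree by name is what would let the documentation map each message to the kind of breakage and the available fix.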

The idea behind this is to improve the error reporting and then
document what type of breakage can be fixed and how.

I'm CCing Chris, as this is a matter of design and direction of fsck,
more opinions are desirable.


* Re: [PATCH v2 3/4] Btrfs-progs: fsck: deal with corrupted csum root
  2014-06-02 17:27   ` David Sterba
@ 2014-06-03  3:25     ` Wang Shilong
  2014-06-03 16:21       ` David Sterba
  0 siblings, 1 reply; 8+ messages in thread
From: Wang Shilong @ 2014-06-03  3:25 UTC (permalink / raw)
  To: dsterba, linux-btrfs, clm

On 06/03/2014 01:27 AM, David Sterba wrote:
> On Thu, May 29, 2014 at 05:59:57PM +0800, Wang Shilong wrote:
>> If checksum root is corrupted, fsck will get segmentation. This
>> is because if we fail to load checksum root, root's node is NULL which
>> cause NULL pointer deferences later.
>>
>> To fix this problem, we just did something like extent tree rebuilding.
>> Allocate a new one and clear uptodate flag. We will do sanity check
>> before fsck going on.
> I'm a bit worried about recommending --init-csum-root, though in this
> case there's not much else left to do. A filesystem with initialized
> csum tree will mount, but reading non-inline data will produce 'csum
> missing' errors.
Agreed.
>> --- a/cmds-check.c
>> +++ b/cmds-check.c
>> @@ -6963,6 +6963,11 @@ int cmd_check(int argc, char **argv)
>>   		ret = -EIO;
>>   		goto close_out;
>>   	}
>> +	if (!extent_buffer_uptodate(info->csum_root->node)) {
>> +		fprintf(stderr, "Checksum root corrupted, rerun with --init-csum-tree option\n");
>> +		ret = -EIO;
>> +		goto close_out;
> So this should prevent segfaults due to missing csum tree, fine. The
> error message can copy what the broken extent tree reports a few lines
> above.
>
> And now that I'm looking at other extent_buffer_uptodate(tree) checks in
> the function, for clarity, each root check should be done separately and
> followed by a message that says which tree is broken.
Normally, extent_buffer_uptodate(tree) is called after reading.
We need it in fsck because we need to reinit the extent tree and csum tree.

Checking it again makes sure the root node has been set up properly and
fsck can go further.
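
The placeholder idea can be sketched as below. This is a simplified model, not the btrfs-progs code: struct ph_node, struct ph_root and both functions are hypothetical, and calloc() stands in for the tree block allocation.

```c
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for the real structures. */
struct ph_node { int uptodate; };
struct ph_root { struct ph_node *node; };

/*
 * After a failed load, install a placeholder node so later code never
 * dereferences NULL, and leave it marked as not up to date.
 * Returns 0 on success, -1 if the allocation fails.
 */
int ph_install_placeholder(struct ph_root *root)
{
	struct ph_node *node = calloc(1, sizeof(*node));

	if (!node)
		return -1;
	node->uptodate = 0;	/* marks the root as still unusable */
	root->node = node;
	return 0;
}

/* Sanity check before fsck goes further: the node must be up to date. */
int ph_can_continue(const struct ph_root *root)
{
	return root->node != NULL && root->node->uptodate;
}
```

The placeholder only becomes usable once a reinit step has rebuilt the tree and marked the node up to date; until then the check makes fsck stop early instead of crashing.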


>
> The idea behind this is to do improve the error reporting and then
> document what type of breakage can be fixed and how.
>
> I'm CCing Chris, as this is a matter of design and direction of fsck,
> more oppinions are desirable.
> .
>



* Re: [PATCH v2 1/4] Btrfs-progs: fsck: only allow partial opening under repair mode
  2014-06-02 16:18 ` [PATCH v2 1/4] Btrfs-progs: fsck: only allow partial opening under repair mode David Sterba
@ 2014-06-03  3:35   ` Wang Shilong
  0 siblings, 0 replies; 8+ messages in thread
From: Wang Shilong @ 2014-06-03  3:35 UTC (permalink / raw)
  To: dsterba, linux-btrfs

On 06/03/2014 12:18 AM, David Sterba wrote:
> On Thu, May 29, 2014 at 05:59:56PM +0800, Wang Shilong wrote:
>> The reason that we allow partial opening is that sometimes,
>> we may have some corrupted trees.(for example extent tree), for
>> fsck repair case, the broken tree may be rebuilt later.
>>
>> So if users only want to do check but not repair anything, this
>> patch will make fsck return failure as soon as possible and
>> tell users that some critial roots have been corrupted.
> Ok, that partially answers my question under v1. This would be a
> different mode, eg. a fast check, that would bail out quickly as you
> intend. I'd really want to keep the (full) check and repair to do the
Mm... That is reasonable too.

Actually, fsck currently bails out if it finds some errors. For example,
if we fail to check the csum tree, it won't check the fs root.

I'm not sure whether fsck should continue when an error happens,
for example a logic error, ENOMEM...

> same sort of checks and verification.
>



* Re: [PATCH v2 3/4] Btrfs-progs: fsck: deal with corrupted csum root
  2014-06-03  3:25     ` Wang Shilong
@ 2014-06-03 16:21       ` David Sterba
  2014-06-04  1:43         ` Wang Shilong
  0 siblings, 1 reply; 8+ messages in thread
From: David Sterba @ 2014-06-03 16:21 UTC (permalink / raw)
  To: Wang Shilong; +Cc: dsterba, linux-btrfs, clm

On Tue, Jun 03, 2014 at 11:25:49AM +0800, Wang Shilong wrote:
> On 06/03/2014 01:27 AM, David Sterba wrote:
> >On Thu, May 29, 2014 at 05:59:57PM +0800, Wang Shilong wrote:
> >>If checksum root is corrupted, fsck will get segmentation. This
> >>is because if we fail to load checksum root, root's node is NULL which
> >>cause NULL pointer deferences later.
> >>
> >>To fix this problem, we just did something like extent tree rebuilding.
> >>Allocate a new one and clear uptodate flag. We will do sanity check
> >>before fsck going on.
> >I'm a bit worried about recommending --init-csum-root, though in this
> >case there's not much else left to do. A filesystem with initialized
> >csum tree will mount, but reading non-inline data will produce 'csum
> >missing' errors.
> Agree.

Are you ok with removing the "rerun with --init-csum-tree option" part
of the message?

> >>--- a/cmds-check.c
> >>+++ b/cmds-check.c
> >>@@ -6963,6 +6963,11 @@ int cmd_check(int argc, char **argv)
> >>  		ret = -EIO;
> >>  		goto close_out;
> >>  	}
> >>+	if (!extent_buffer_uptodate(info->csum_root->node)) {
> >>+		fprintf(stderr, "Checksum root corrupted, rerun with --init-csum-tree option\n");
> >>+		ret = -EIO;
> >>+		goto close_out;
> >So this should prevent segfaults due to missing csum tree, fine. The
> >error message can copy what the broken extent tree reports a few lines
> >above.
> >
> >And now that I'm looking at other extent_buffer_uptodate(tree) checks in
> >the function, for clarity, each root check should be done separately and
> >followed by a message that says which tree is broken.
> Normally, extent_buffer_update(tree) is called after reading.
> We need this in fsck is because we need reinit extent tree and csum tree.
> 
> check it again is to make sure root node has been setup properly and
> fsck can go further..

Yeah, I see how it works now, thanks.

I've reorganized the patches in integration so the ones for fsck are
grouped together. Fsck is scary and obviously needs more reviews, so the
patches will be pushed towards release branches based on that, that is,
on reviews or tests. I appreciate your work in that area and hope you
understand the slow progress with your patches.


* Re: [PATCH v2 3/4] Btrfs-progs: fsck: deal with corrupted csum root
  2014-06-03 16:21       ` David Sterba
@ 2014-06-04  1:43         ` Wang Shilong
  0 siblings, 0 replies; 8+ messages in thread
From: Wang Shilong @ 2014-06-04  1:43 UTC (permalink / raw)
  To: dsterba, linux-btrfs, clm

On 06/04/2014 12:21 AM, David Sterba wrote:
> On Tue, Jun 03, 2014 at 11:25:49AM +0800, Wang Shilong wrote:
>> On 06/03/2014 01:27 AM, David Sterba wrote:
>>> On Thu, May 29, 2014 at 05:59:57PM +0800, Wang Shilong wrote:
>>>> If checksum root is corrupted, fsck will get segmentation. This
>>>> is because if we fail to load checksum root, root's node is NULL which
>>>> cause NULL pointer deferences later.
>>>>
>>>> To fix this problem, we just did something like extent tree rebuilding.
>>>> Allocate a new one and clear uptodate flag. We will do sanity check
>>>> before fsck going on.
>>> I'm a bit worried about recommending --init-csum-root, though in this
>>> case there's not much else left to do. A filesystem with initialized
>>> csum tree will mount, but reading non-inline data will produce 'csum
>>> missing' errors.
>> Agree.
> Are you ok with removing the "rerun with --init-csum-tree option" part
> of the message?
That's not good, I agree with your point here.
>
>>>> --- a/cmds-check.c
>>>> +++ b/cmds-check.c
>>>> @@ -6963,6 +6963,11 @@ int cmd_check(int argc, char **argv)
>>>>   		ret = -EIO;
>>>>   		goto close_out;
>>>>   	}
>>>> +	if (!extent_buffer_uptodate(info->csum_root->node)) {
>>>> +		fprintf(stderr, "Checksum root corrupted, rerun with --init-csum-tree option\n");
>>>> +		ret = -EIO;
>>>> +		goto close_out;
>>> So this should prevent segfaults due to missing csum tree, fine. The
>>> error message can copy what the broken extent tree reports a few lines
>>> above.
>>>
>>> And now that I'm looking at other extent_buffer_uptodate(tree) checks in
>>> the function, for clarity, each root check should be done separately and
>>> followed by a message that says which tree is broken.
>> Normally, extent_buffer_update(tree) is called after reading.
>> We need this in fsck is because we need reinit extent tree and csum tree.
>>
>> check it again is to make sure root node has been setup properly and
>> fsck can go further..
> Yeah, I see how it works now, thanks.
>
> I've reorganized the patches in integration so the ones for fsck are
> grouped together. Fsck is scary and needs more reviews obviously, so the
> patches will be pushed towards release branches based on that. Reviews
> or tests so to say. I appreciate your work in that area and hope you
> understand the slow progress with your patches.
That's ok for me, thanks for your review and comments^_^

> .
>


