From: Chao Yu <yuchao0@huawei.com>
To: Jaegeuk Kim <jaegeuk@kernel.org>
Cc: linux-f2fs-devel@lists.sourceforge.net
Subject: Re: Possible issues with fsck of f2fs root
Date: Mon, 22 Apr 2019 10:33:50 +0800
Message-ID: <22c88c02-9854-48fb-70c1-9144edaa0425@huawei.com>
In-Reply-To: <20190421102703.GC7295@jaegeuk-macbookpro.roam.corp.google.com>

On 2019/4/21 18:27, Jaegeuk Kim wrote:
> On 04/20, Chao Yu wrote:
>> On 2019/4/17 2:53, Jaegeuk Kim wrote:
>>> On 04/02, Hagbard Celine wrote:
>>>> Hi, I lost the root filesystem on my previous install after a few
>>>> weeks of several power outages last winter. While trying to recover I
>>>> discovered that it seems fsck was never run properly during boot in the
>>>> lifetime of that install.
>>>> After getting the system installed again a while ago, I have been
>>>> trying to discern why.
>>>> So far I've found the following two possible issues:
>>>>
>>>> ISSUE 1:
>>>> If I boot with kernel option "ro rootfstype=f2fs
>>>> rootflags=background_gc=on,heap,disable_ext_identify,discard,user_xattr,inline_xattr,inline_dentry,acl,inline_data,flush_merge,data_flush,extent_cache,whint_mode=fs-based,fsync_mode=strict"
>>>> I get the following halfway through boot:
>>>>
>>>>  * Checking local filesystems  ...
>>>> Info: Use default preen mode
>>>> Info: Mounted device!
>>>> Info: Check FS only due to RO
>>>>         Error: Failed to open the device!
>>>>  * Filesystems couldn't be fixed
>>>>
>>>>
>>>>                      [ !! ] * rc: Aborting!
>>>>
>>>> If I, from this state, try to mount another partition:
>>>> # mount -o "ro,relatime,lazytime,background_gc=on,discard,heap,user_xattr,inline_xattr,acl,disable_ext_identify,inline_data,inline_dentry,flush_merge,extent_cache,data_flush,mode=adaptive,active_logs=6,whint_mode=fs-based,alloc_mode=default,fsync_mode=strict"
>>>> /dev/nvme0n1p7 /mnt/f2fstest/
>>>>
>>>> I get the same error if I try to run fsck on it:
>>>> # fsck.f2fs /dev/nvme0n1p7
>>>> Info: Mounted device!
>>>> Info: Check FS only due to RO
>>>>         Error: Failed to open the device!
>>>>
>>>> If I on the other hand boot with kernel option "rw rootfstype=f2fs
>>>> rootflags=background_gc=on,heap,disable_ext_identify,discard,user_xattr,inline_xattr,inline_dentry,acl,inline_data,flush_merge,data_flush,extent_cache,whint_mode=fs-based,fsync_mode=strict
>>>> panic=30 scsi_mod.use_blk_mq=1"
>>>>
>>>> The boot does not hang and if I try same test as before, mount test partition:
>>>> # mount -o "ro,relatime,lazytime,background_gc=on,discard,heap,user_xattr,inline_xattr,acl,disable_ext_identify,inline_data,inline_dentry,flush_merge,extent_cache,data_flush,mode=adaptive,active_logs=6,whint_mode=fs-based,alloc_mode=default,fsync_mode=strict"
>>>> /dev/nvme0n1p7 /mnt/f2fstest/
>>>>
>>>> Run fsck:
>>>> # fsck.f2fs  -f /dev/nvme0n1p7
>>>> Info: Force to fix corruption
>>>> Info: Mounted device!
>>>> Info: Check FS only due to RO
>>>> Info: Segments per section = 1
>>>> Info: Sections per zone = 1
>>>> Info: sector size = 512
>>>> Info: total sectors = 134101647 (65479 MB)
>>>> Info: MKFS version
>>>>   "Linux version 5.0.5-gentoof2fsfix (root@40o2) (gcc version 8.2.0
>>>> (Gentoo 8.2.0-r6 p1.7)) #2 SMP PREEMPT Mon Apr 1 17:04:41 +01 2019"
>>>> Info: FSCK version
>>>>   from "Linux version 5.0.5-gentoo (root@40o2) (gcc version 8.2.0
>>>> (Gentoo 8.2.0-r6 p1.7)) #2 SMP PREEMPT Tue Apr 2 07:42:40 +01 2019"
>>>>     to "Linux version 5.0.5-gentoo (root@40o2) (gcc version 8.2.0
>>>> (Gentoo 8.2.0-r6 p1.7)) #2 SMP PREEMPT Tue Apr 2 07:42:40 +01 2019"
>>>> Info: superblock features = 0 :
>>>> Info: superblock encrypt level = 0, salt = 00000000000000000000000000000000
>>>> Info: total FS sectors = 134101640 (65479 MB)
>>>> Info: CKPT version = 70e1454a
>>>> Info: Checked valid nat_bits in checkpoint
>>>> Info: checkpoint state = 4c1 :  large_nat_bitmap nat_bits crc unmount
>>>>
>>>> [FSCK] Unreachable nat entries                        [Ok..] [0x0]
>>>> [FSCK] SIT valid block bitmap checking                [Ok..]
>>>> [FSCK] Hard link checking for regular file            [Ok..] [0x70]
>>>> [FSCK] valid_block_count matching with CP             [Ok..] [0x1fe244]
>>>> [FSCK] valid_node_count matcing with CP (de lookup)   [Ok..] [0x6c487]
>>>> [FSCK] valid_node_count matcing with CP (nat lookup)  [Ok..] [0x6c487]
>>>> [FSCK] valid_inode_count matched with CP              [Ok..] [0x6c362]
>>>> [FSCK] free segment_count matched with CP             [Ok..] [0x6c44]
>>>> [FSCK] next block offset is free                      [Ok..]
>>>> [FSCK] fixing SIT types
>>>> [FSCK] other corrupted bugs                           [Ok..]
>>>>
>>>> Done.
>>>>
>>>> So a system booted with "rw" root can fsck an "ro" filesystem but a
>>>> system booted with root "ro" can not.
>>>>
>>>>
>>>> ISSUE 2:
>>>> Referring to the output from the fsck running against a "ro"
>>>> filesystem, especially this line:
>>>> Info: Check FS only due to RO
>>>>
>>>> As far as I can tell, this says that, as opposed to other filesystems,
>>>> running fsck against a "ro" mounted f2fs partition will never fix any
>>>> errors.
>>>> So I tried running fsck against the same partition mounted "rw":
>>>> # mount -o remount,rw /mnt/f2fstest/
>>>> # fsck.f2fs  -f /dev/nvme0n1p7
>>>> Info: Force to fix corruption
>>>> Info: Mounted device!
>>>>         Error: Not available on mounted device!
>>>>
>>>> I might be misunderstanding something, but all this tells me that
>>>> unless one makes a custom initramfs that runs fsck before root is
>>>> mounted (something no distribution has, as far as I know), fsck will
>>>> never fix an f2fs formatted root partition during boot.
>>>> If this is by design and not a bug/unintended behavior, it should be
>>>> documented somewhere lest more people experience system crashes
>>>> like mine.
>>>>
>>>> All tests above done with kernel 5.0.5 and f2fs-tools 1.12.0 with
>>>> "fsck.f2fs: allow to fsck readonly image w/ -f option"-patch by Chao
>>>> Yu.
>>>
>>> Hi Hagbard,
>>>
>>> It looks like fsck.f2fs failed to open a device as RW on RO disk. Could you
>>> try this patch?
>>>
>>> Thanks,
>>>
>>> From 3f18ff744f4d510d8e2f42c5a3b2539651baccc5 Mon Sep 17 00:00:00 2001
>>> From: Jaegeuk Kim <jaegeuk@kernel.org>
>>> Date: Tue, 16 Apr 2019 11:46:31 -0700
>>> Subject: [PATCH] fsck.f2fs: open ro disk if we want to check fs only
>>>
>>> This patch fixes the "open failure" issue on ro disk, reported by Hagbard.
>>>
>>> "
>>>  If I boot with kernel option "ro rootfstype=f2fs
>>>  I get the following halfway trough boot:
>>>
>>>   * Checking local filesystems  ...
>>>  Info: Use default preen mode
>>>  Info: Mounted device!
>>>  Info: Check FS only due to RO
>>>          Error: Failed to open the device!
>>>   * Filesystems couldn't be fixed
>>> "
>>>
>>> Reported-by: Hagbard Celine <hagbardcelin@gmail.com>
>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
>>> ---
>>>  lib/libf2fs.c | 13 ++++++++++---
>>>  1 file changed, 10 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/lib/libf2fs.c b/lib/libf2fs.c
>>> index f8f6921..1a0d179 100644
>>> --- a/lib/libf2fs.c
>>> +++ b/lib/libf2fs.c
>>> @@ -818,9 +818,16 @@ int get_device_info(int i)
>>>  	unsigned char model_inq[6] = {MODELINQUIRY};
>>>  #endif
>>>  	struct device_info *dev = c.devices + i;
>>> +	int rw_flag;
>>> +
>>> +	/* Check FS only */
>>> +	if (c.fix_on == 0 && c.auto_fix == 0)
>>> +		rw_flag = O_RDONLY;
>>> +	else
>>> +		rw_flag = O_RDWR;
>>>  
>>>  	if (c.sparse_mode) {
>>> -		fd = open(dev->path, O_RDWR | O_CREAT | O_BINARY, 0644);
>>> +		fd = open(dev->path, rw_flag | O_CREAT | O_BINARY, 0644);
>>>  		if (fd < 0) {
>>>  			MSG(0, "\tError: Failed to open a sparse file!\n");
>>>  			return -1;
>>> @@ -838,9 +845,9 @@ int get_device_info(int i)
>>>  		}
>>>  
>>>  		if (S_ISBLK(stat_buf->st_mode) && !c.force)
>>> -			fd = open(dev->path, O_RDWR | O_EXCL);
>>> +			fd = open(dev->path, rw_flag | O_EXCL);
>>>  		else
>>> -			fd = open(dev->path, O_RDWR);
>>> +			fd = open(dev->path, rw_flag);
>>>  	}
>>>  	if (fd < 0) {
>>>  		MSG(0, "\tError: Failed to open the device!\n");
>>
>> Jaegeuk,
>>
>> The last merged patch wasn't sent out, so I'm replying to this old one.
>>
>> 		if (S_ISBLK(stat_buf->st_mode) && !c.force) {
>>
>> Shouldn't it be (.. && c.force)?

The point I raised in that reply is wrong; please ignore it. :P

> 
> This has nothing to do with this patch, though. It looks good since, in the
> normal case, we'd better use O_EXCL, but in force mode we'd like to open the
> device freely.
> 
> New version of the patch is:
> 
> From 3221692b060649378f1f69b898ed85a814af3dbf Mon Sep 17 00:00:00 2001
> From: Jaegeuk Kim <jaegeuk@kernel.org>
> Date: Tue, 16 Apr 2019 11:46:31 -0700
> Subject: [PATCH] fsck.f2fs: open ro disk if we want to check fs only
> 
> This patch fixes the "open failure" issue on ro disk, reported by Hagbard.
> 
> "
>  If I boot with kernel option "ro rootfstype=f2fs
>  I get the following halfway trough boot:
> 
>   * Checking local filesystems  ...
>  Info: Use default preen mode
>  Info: Mounted device!
>  Info: Check FS only due to RO
>          Error: Failed to open the device!
>   * Filesystems couldn't be fixed

The behavior above is the same as ext4's, so we don't need to check that.

The difference between ext4 and f2fs here is that, on an ro-mounted image,
e2fsck can check and fix issues, but fsck.f2fs only does the check. That is
what Hagbard complained about, since there is no way to repair an ro- or
rw-mounted image with fsck.f2fs...
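
To make the mount-state distinction concrete, here is a minimal sketch of how
a tool could classify a device as not mounted, mounted ro, or mounted rw. It
is not the actual fsck.f2fs code: mount_state_of() and enum mount_state are
hypothetical names, and it assumes Linux/glibc, where <mntent.h> provides
setmntent(), getmntent(), hasmntopt() and endmntent().

```c
#include <assert.h>
#include <mntent.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical classification, for illustration only. */
enum mount_state { NOT_MOUNTED, MOUNTED_RO, MOUNTED_RW };

/*
 * Look up "dev" in a mounts table such as /proc/mounts and report how it
 * is mounted. The first matching entry wins.
 */
static enum mount_state mount_state_of(const char *mounts_file,
				       const char *dev)
{
	enum mount_state st = NOT_MOUNTED;
	struct mntent *ent;
	FILE *f;

	assert(mounts_file != NULL && dev != NULL);

	f = setmntent(mounts_file, "r");
	if (f == NULL)
		return NOT_MOUNTED;

	while ((ent = getmntent(f)) != NULL) {
		if (strcmp(ent->mnt_fsname, dev) != 0)
			continue;
		/* hasmntopt() matches whole options, so "remount-ro" is not "ro" */
		st = hasmntopt(ent, MNTOPT_RO) ? MOUNTED_RO : MOUNTED_RW;
		break;
	}
	endmntent(f);
	return st;
}
```

With this, mount_state_of("/proc/mounts", "/dev/nvme0n1p7") would return
MOUNTED_RO for the test partition above, which is the case where fsck.f2fs
prints "Check FS only due to RO".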

Thanks,

> "
> 
> Reported-by: Hagbard Celine <hagbardcelin@gmail.com>
> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> ---
>  lib/libf2fs.c | 25 +++++++++++++++++++++----
>  1 file changed, 21 insertions(+), 4 deletions(-)
> 
> diff --git a/lib/libf2fs.c b/lib/libf2fs.c
> index d30047f..853e713 100644
> --- a/lib/libf2fs.c
> +++ b/lib/libf2fs.c
> @@ -789,6 +789,15 @@ void get_kernel_uname_version(__u8 *version)
>  #endif /* APPLE_DARWIN */
>  
>  #ifndef ANDROID_WINDOWS_HOST
> +static int open_check_fs(char *path, int flag)
> +{
> +	if (c.func != FSCK || c.fix_on || c.auto_fix)
> +		return -1;
> +
> +	/* allow to open ro */
> +	return open(path, O_RDONLY | flag);
> +}
> +
>  int get_device_info(int i)
>  {
>  	int32_t fd = 0;
> @@ -810,8 +819,11 @@ int get_device_info(int i)
>  	if (c.sparse_mode) {
>  		fd = open(dev->path, O_RDWR | O_CREAT | O_BINARY, 0644);
>  		if (fd < 0) {
> -			MSG(0, "\tError: Failed to open a sparse file!\n");
> -			return -1;
> +			fd = open_check_fs(dev->path, O_BINARY);
> +			if (fd < 0) {
> +				MSG(0, "\tError: Failed to open a sparse file!\n");
> +				return -1;
> +			}
>  		}
>  	}
>  
> @@ -825,10 +837,15 @@ int get_device_info(int i)
>  			return -1;
>  		}
>  
> -		if (S_ISBLK(stat_buf->st_mode) && !c.force)
> +		if (S_ISBLK(stat_buf->st_mode) && !c.force) {
>  			fd = open(dev->path, O_RDWR | O_EXCL);
> -		else
> +			if (fd < 0)
> +				fd = open_check_fs(dev->path, O_EXCL);
> +		} else {
>  			fd = open(dev->path, O_RDWR);
> +			if (fd < 0)
> +				fd = open_check_fs(dev->path, 0);
> +		}
>  	}
>  	if (fd < 0) {
>  		MSG(0, "\tError: Failed to open the device!\n");
> 
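
For reference, the fallback in the new version of the patch can be sketched
outside of libf2fs as follows. This is a simplified illustration, not the
merged code: struct check_opts stands in for the relevant fields of the global
config "c", and open_device() is a hypothetical wrapper around the two open()
attempts.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the fields of the global config used here. */
struct check_opts {
	bool is_fsck;	/* corresponds to c.func == FSCK */
	bool fix_on;	/* corresponds to c.fix_on */
	bool auto_fix;	/* corresponds to c.auto_fix */
};

/* Open read-only, but only for a run that will not write anyway. */
static int open_check_only(const struct check_opts *o, const char *path,
			   int flag)
{
	if (!o->is_fsck || o->fix_on || o->auto_fix)
		return -1;

	return open(path, O_RDONLY | flag);
}

/* Try read-write first; fall back to read-only for a check-only run. */
static int open_device(const struct check_opts *o, const char *path, int flag)
{
	int fd;

	assert(o != NULL && path != NULL);

	fd = open(path, O_RDWR | flag);
	if (fd < 0)
		fd = open_check_only(o, path, flag);
	return fd;
}
```

The read-write attempt still comes first, so a run that is allowed to fix
things behaves as before; only a check-only fsck gets the read-only retry. The
caller passes O_EXCL in "flag" for a block device when force mode is off, as
in the patch.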
