* testing result of loop-aio patchset on ext3
@ 2014-07-14  9:34 Rui Xiang
  2014-07-14  9:51 ` Lukáš Czerner
  0 siblings, 1 reply; 11+ messages in thread
From: Rui Xiang @ 2014-07-14  9:34 UTC (permalink / raw)
  To: Dave Kleikamp, linux-ext4; +Cc: linux-fsdevel, linux-kernel, Li Zefan

Hi Dave,

We export a container image file as a block device via a loop device,
but we found that the container rootfs very easily gets corrupted on
power loss.

An early version of your loop-aio patchset claimed that it makes
loop-mounted filesystems recoverable (lkml.org/lkml/2012/3/30/317),
but we found it doesn't help.

Both the guest fs and host fs are ext3.

The loop-aio patchset is from:
git://github.com/kleikamp/linux-shaggy.git aio_loop

Steps:
1. dd a 10G image, mkfs.ext3,
  # dd if=/dev/zero of=./raw_image bs=1M count=10000
  # echo y | mkfs.ext3 raw_image

2. losetup a loop device, mount at ./test_dir
  # losetup /dev/loop1 raw_image
  # mount /dev/loop1 ./test_dir

3. copy fs_mark into test_dir and run
  # ./fs_mark -d ./tmp/ -s 102400000 -n 80

4. While fs_mark is running, force an immediate system reboot:
  # echo b > /proc/sysrq-trigger

After the system booted up, fsck sometimes reported that the raw_image fs had been damaged.

# fsck.ext3 -n raw_image
e2fsck 1.41.9 (22-Aug-2009)
Warning: skipping journal recovery because doing a read-only filesystem check.
raw_image contains a file system with errors, check forced.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Free blocks count wrong (2481348, counted=2480577).
Fix? no
Free inodes count wrong (640837, counted=640835).
Fix? no
raw_image: ********** WARNING: Filesystem still has errors **********
raw_image: 11/640848 files (0.0% non-contiguous), 78652/2560000 blocks


With a specific script, I can reproduce this issue almost 100% of the time.
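The script itself isn't included here; purely as an illustration, a
per-boot reproduction loop along the lines of the steps above might
look like the sketch below (the paths, loop device number and sleep
interval are made up, not taken from the actual script):

  #!/bin/bash
  # Hypothetical per-boot reproduction script -- not the one used in this report.
  losetup /dev/loop1 /root/raw_image
  mount /dev/loop1 /root/test_dir
  (cd /root/test_dir && ./fs_mark -d ./tmp/ -s 102400000 -n 80) &
  sleep $((RANDOM % 30 + 5))      # let fs_mark dirty some data, vary the timing
  echo b > /proc/sysrq-trigger    # immediate reboot, simulating power loss
  # after the machine comes back up, check the image again:
  #   fsck.ext3 -n /root/raw_image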

And it seems the corruption only happens when the reboot occurs while
loop is calling vfs_fsync().
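For what it's worth, one way to watch for this race is the function
tracer, assuming ftrace is enabled in the kernel and vfs_fsync shows
up in available_filter_functions:

  # mount -t debugfs none /sys/kernel/debug     (if not already mounted)
  # echo vfs_fsync > /sys/kernel/debug/tracing/set_ftrace_filter
  # echo function > /sys/kernel/debug/tracing/current_tracer
  # cat /sys/kernel/debug/tracing/trace_pipe | grep loop
    (the loop1 kernel thread shows up here each time it calls vfs_fsync)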

Do you have any idea why the loop-aio patchset doesn't help?

Thanks.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: testing result of loop-aio patchset on ext3
  2014-07-14  9:34 testing result of loop-aio patchset on ext3 Rui Xiang
@ 2014-07-14  9:51 ` Lukáš Czerner
  2014-07-16  3:54     ` Rui Xiang
  0 siblings, 1 reply; 11+ messages in thread
From: Lukáš Czerner @ 2014-07-14  9:51 UTC (permalink / raw)
  To: Rui Xiang
  Cc: Dave Kleikamp, linux-ext4, linux-fsdevel, linux-kernel, Li Zefan

On Mon, 14 Jul 2014, Rui Xiang wrote:

> Date: Mon, 14 Jul 2014 17:34:38 +0800
> From: Rui Xiang <rui.xiang@huawei.com>
> To: Dave Kleikamp <dave.kleikamp@oracle.com>, linux-ext4@vger.kernel.org
> Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
>     Li Zefan <lizefan@huawei.com>
> Subject: testing result of loop-aio patchset on ext3
> 
> Hi Dave,
> 
> We export a container image file as a block device via loop device, but we
> found it's very easy that the container rootfs gets corrupted due to power
> loss.
> 
> Your early version of loop-aio patchset said the patchset can make loop
> mounted filesystems recoverable(lkml.org/lkml/2012/3/30/317), but we found
> it doesn't help.
> 
> Both the guest fs and host fs are ext3.
> 
> The loop-aio patchset is from:
> git://github.com/kleikamp/linux-shaggy.git aio_loop
> 
> Steps:
> 1. dd a 10G image, mkfs.ext3,
>   # dd if=/dev/zero of=./raw_image bs=1M count=10000
>   # echo y | mkfs.ext3 raw_image
> 
> 2. losetup a loop device, mount at ./test_dir
>   # losetup /dev/loop1 raw_image
>   # mount /dev/loop1 ./test_dir
> 
> 3. copy fs_mark into test_dir and run
>   # ./fs_mark -d ./tmp/ -s 102400000 -n 80
> 
> 4. during runing fs_mark, make systerm reboot indirectly.
>   # echo b > /proc/sysrq-trigger
> 
> After systerm booted up, sometimes fsck reported raw_image fs has been damaged.
> 
> # fsck.ext3 -n raw_image
> e2fsck 1.41.9 (22-Aug-2009)
> Warning: skipping journal recovery because doing a read-only filesystem check.
> raw_image contains a file system with errors, check forced.
> Pass 1: Checking inodes, blocks, and sizes
> Pass 2: Checking directory structure
> Pass 3: Checking directory connectivity
> Pass 4: Checking reference counts
> Pass 5: Checking group summary information
> Free blocks count wrong (2481348, counted=2480577).
> Fix? no
> Free inodes count wrong (640837, counted=640835).
> Fix? no
> raw_image: ********** WARNING: Filesystem still has errors **********
> raw_image: 11/640848 files (0.0% non-contiguous), 78652/2560000 blocks

It's not damaged; this is the expected result if you're using an old
e2fsprogs which still treats this as an error.

It's not an error because we only update the superblock summary at
unmount time, so after an unclean shutdown it is likely not to match
reality, but e2fsck can and will easily fix that for you.

Please try e2fsprogs v1.42.3 or newer.
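For instance (the image name is the one from your report, the options
are standard e2fsck ones):

  # e2fsck -V                  # check which e2fsprogs version is actually run
  # fsck.ext3 -fy raw_image    # replays the journal first; the stale free
                               # block/inode counts then get fixed automatically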

Thanks!
-Lukas

> 
> 
> With a specific script, I can almost 100% reproduce this issue.
> 
> And it seems the corruption can only happen when reboot happens at the
> time loop is calling vfs_fsync().
> 
> Do you have any idea why the loop-aio patchset doesn't help?
> 
> Thanks.
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: testing result of loop-aio patchset on ext3
  2014-07-14  9:51 ` Lukáš Czerner
@ 2014-07-16  3:54     ` Rui Xiang
  0 siblings, 0 replies; 11+ messages in thread
From: Rui Xiang @ 2014-07-16  3:54 UTC (permalink / raw)
  To: Lukáš Czerner
  Cc: Dave Kleikamp, linux-ext4, linux-fsdevel, linux-kernel, Li Zefan

On 2014/7/14 17:51, Lukáš Czerner wrote:
> On Mon, 14 Jul 2014, Rui Xiang wrote:
> 
>> Date: Mon, 14 Jul 2014 17:34:38 +0800
>> From: Rui Xiang <rui.xiang@huawei.com>
>> To: Dave Kleikamp <dave.kleikamp@oracle.com>, linux-ext4@vger.kernel.org
>> Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
>>     Li Zefan <lizefan@huawei.com>
>> Subject: testing result of loop-aio patchset on ext3
>>
>> Hi Dave,
>>
>> We export a container image file as a block device via loop device, but we
>> found it's very easy that the container rootfs gets corrupted due to power
>> loss.
>>
>> Your early version of loop-aio patchset said the patchset can make loop
>> mounted filesystems recoverable(lkml.org/lkml/2012/3/30/317), but we found
>> it doesn't help.
>>
>> Both the guest fs and host fs are ext3.
>>
>> The loop-aio patchset is from:
>> git://github.com/kleikamp/linux-shaggy.git aio_loop
>>
>> Steps:
>> 1. dd a 10G image, mkfs.ext3,
>>   # dd if=/dev/zero of=./raw_image bs=1M count=10000
>>   # echo y | mkfs.ext3 raw_image
>>
>> 2. losetup a loop device, mount at ./test_dir
>>   # losetup /dev/loop1 raw_image
>>   # mount /dev/loop1 ./test_dir
>>
>> 3. copy fs_mark into test_dir and run
>>   # ./fs_mark -d ./tmp/ -s 102400000 -n 80
>>
>> 4. during runing fs_mark, make systerm reboot indirectly.
>>   # echo b > /proc/sysrq-trigger
>>
>> After systerm booted up, sometimes fsck reported raw_image fs has been damaged.
>>
>> # fsck.ext3 -n raw_image
>> e2fsck 1.41.9 (22-Aug-2009)
>> Warning: skipping journal recovery because doing a read-only filesystem check.
>> raw_image contains a file system with errors, check forced.
>> Pass 1: Checking inodes, blocks, and sizes
>> Pass 2: Checking directory structure
>> Pass 3: Checking directory connectivity
>> Pass 4: Checking reference counts
>> Pass 5: Checking group summary information
>> Free blocks count wrong (2481348, counted=2480577).
>> Fix? no
>> Free inodes count wrong (640837, counted=640835).
>> Fix? no
>> raw_image: ********** WARNING: Filesystem still has errors **********
>> raw_image: 11/640848 files (0.0% non-contiguous), 78652/2560000 blocks
> 
> It's not damaged, this is expected result if you're using old
> e2fsprogs which still treats this as an error.
> 
> It's not an error because we only update superblock summary at
> unmount time so with unclean shutdown it's likely that it does not
> match the reality, but e2fsck can and will easily fix that for you.
> 
> Please try e2fsprogs v1.42.3 or newer.
> 

Hi Lukas,

I updated e2fsprogs to v1.42.3 and used the newer fsck.ext3 to check
raw_image. Indeed, the result looked normal.

Then I continued my previous test, and after 35 runs "fsck -n" again
reported that the image fs had been damaged.

 # fsck.ext3 -n image1
e2fsck 1.42.3.wc1 (28-May-2012)
Warning: skipping journal recovery because doing a read-only filesystem check.
image1 has been mounted 36 times without being checked, check forced.
Pass 1: Checking inodes, blocks, and sizes
Inode 16407, i_size is 597447, should be 602112.  Fix? no
Inode 16407, i_blocks is 1176, should be 1184.  Fix? no
Inode 409941, i_blocks is 200208, should be 112.  Fix? no
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Block bitmap differences:  -1506836 -1506843 -(1506859--1506860) -(1660941--1661964) -(1661966--1671167) -(1671688--1686473)
Fix? no
Free blocks count wrong for group #2 (31558, counted=31556).
Fix? no
Free blocks count wrong for group #43 (15871, counted=15867).
Fix? no
Free blocks count wrong (2204041, counted=2204035).
Fix? no
image1: ********** WARNING: Filesystem still has errors **********
image1: 13008/655360 files (0.3% non-contiguous), 417399/2621440 blocks

I backed up the image to image_bk, then mounted the image on a dir and cat'ed all the files in it.
Steps:
# dd if=image1 of=image_bk
# mount image1 err_dir
# find . -name '*' -exec cat {} \; > /dev/null

There were no issues while cat'ing the files, and no errors in dmesg either.

*But after I umounted image1 from err_dir, fsck no longer reported any fs corruption.

I mounted image_bk on err_dir and umounted it right away, without doing anything else. The result is the same as for image1.
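That is, the sequence on the backup copy is effectively this (the
mount point is arbitrary):

  # fsck.ext3 -fn image_bk                      # journal not replayed: same errors as image1
  # mount -o loop image_bk /mnt && umount /mnt  # mounting replays the journal
  # fsck.ext3 -fn image_bk                      # now reports no corruption, as above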

*So, is the fs in the image (exported as a block device via the loop
device) really damaged, or is there some other issue?
Could you give me your opinion?


Thanks.



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: testing result of loop-aio patchset on ext3
  2014-07-16  3:54     ` Rui Xiang
  (?)
@ 2014-07-16  7:58     ` Lukáš Czerner
  2014-07-16  9:28       ` Rui Xiang
  -1 siblings, 1 reply; 11+ messages in thread
From: Lukáš Czerner @ 2014-07-16  7:58 UTC (permalink / raw)
  To: Rui Xiang
  Cc: Dave Kleikamp, linux-ext4, linux-fsdevel, linux-kernel, Li Zefan

On Wed, 16 Jul 2014, Rui Xiang wrote:

> Date: Wed, 16 Jul 2014 11:54:24 +0800
> From: Rui Xiang <rui.xiang@huawei.com>
> To: Lukáš Czerner <lczerner@redhat.com>
> Cc: Dave Kleikamp <dave.kleikamp@oracle.com>, linux-ext4@vger.kernel.org,
>     linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
>     Li Zefan <lizefan@huawei.com>
> Subject: Re: testing result of loop-aio patchset on ext3
> 
> On 2014/7/14 17:51, Lukáš Czerner wrote:
> > On Mon, 14 Jul 2014, Rui Xiang wrote:
> > 
> >> Date: Mon, 14 Jul 2014 17:34:38 +0800
> >> From: Rui Xiang <rui.xiang@huawei.com>
> >> To: Dave Kleikamp <dave.kleikamp@oracle.com>, linux-ext4@vger.kernel.org
> >> Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
> >>     Li Zefan <lizefan@huawei.com>
> >> Subject: testing result of loop-aio patchset on ext3
> >>
> >> Hi Dave,
> >>
> >> We export a container image file as a block device via loop device, but we
> >> found it's very easy that the container rootfs gets corrupted due to power
> >> loss.
> >>
> >> Your early version of loop-aio patchset said the patchset can make loop
> >> mounted filesystems recoverable(lkml.org/lkml/2012/3/30/317), but we found
> >> it doesn't help.
> >>
> >> Both the guest fs and host fs are ext3.
> >>
> >> The loop-aio patchset is from:
> >> git://github.com/kleikamp/linux-shaggy.git aio_loop
> >>
> >> Steps:
> >> 1. dd a 10G image, mkfs.ext3,
> >>   # dd if=/dev/zero of=./raw_image bs=1M count=10000
> >>   # echo y | mkfs.ext3 raw_image
> >>
> >> 2. losetup a loop device, mount at ./test_dir
> >>   # losetup /dev/loop1 raw_image
> >>   # mount /dev/loop1 ./test_dir
> >>
> >> 3. copy fs_mark into test_dir and run
> >>   # ./fs_mark -d ./tmp/ -s 102400000 -n 80
> >>
> >> 4. during runing fs_mark, make systerm reboot indirectly.
> >>   # echo b > /proc/sysrq-trigger
> >>
> >> After systerm booted up, sometimes fsck reported raw_image fs has been damaged.
> >>
> >> # fsck.ext3 -n raw_image
> >> e2fsck 1.41.9 (22-Aug-2009)
> >> Warning: skipping journal recovery because doing a read-only filesystem check.
> >> raw_image contains a file system with errors, check forced.
> >> Pass 1: Checking inodes, blocks, and sizes
> >> Pass 2: Checking directory structure
> >> Pass 3: Checking directory connectivity
> >> Pass 4: Checking reference counts
> >> Pass 5: Checking group summary information
> >> Free blocks count wrong (2481348, counted=2480577).
> >> Fix? no
> >> Free inodes count wrong (640837, counted=640835).
> >> Fix? no
> >> raw_image: ********** WARNING: Filesystem still has errors **********
> >> raw_image: 11/640848 files (0.0% non-contiguous), 78652/2560000 blocks
> > 
> > It's not damaged, this is expected result if you're using old
> > e2fsprogs which still treats this as an error.
> > 
> > It's not an error because we only update superblock summary at
> > unmount time so with unclean shutdown it's likely that it does not
> > match the reality, but e2fsck can and will easily fix that for you.
> > 
> > Please try e2fsprogs v1.42.3 or newer.
> > 
> 
> Hi Lukas,
> 
> I updated e2fsprogs to v1.42.3, and user the newer fsck.ext3 to check raw_image.
> Exactly, the result seemed normal.

Now I can see that there are many more problems than before; that's
weird. Sorry for not making this clear, but for this kind of
reproducer please use the most recent e2fsprogs. Also, what is the
kernel version you're using in this test?

Thanks!
-Lukas

> 
> Then, I continue my previous test. And after testing 35 times, "fsck -n" reported image fs
> had been damaged, too.
> 
>  # fsck.ext3 -n image1
> e2fsck 1.42.3.wc1 (28-May-2012)
> Warning: skipping journal recovery because doing a read-only filesystem check.
> image1 has been mounted 36 times without being checked, check forced.
> Pass 1: Checking inodes, blocks, and sizes
> Inode 16407, i_size is 597447, should be 602112.  Fix? no
> Inode 16407, i_blocks is 1176, should be 1184.  Fix? no
> Inode 409941, i_blocks is 200208, should be 112.  Fix? no
> Pass 2: Checking directory structure
> Pass 3: Checking directory connectivity
> Pass 4: Checking reference counts
> Pass 5: Checking group summary information
> Block bitmap differences:  -1506836 -1506843 -(1506859--1506860) -(1660941--1661964) -(1661966--1671167) -(1671688--1686473)
> Fix? no
> Free blocks count wrong for group #2 (31558, counted=31556).
> Fix? no
> Free blocks count wrong for group #43 (15871, counted=15867).
> Fix? no
> Free blocks count wrong (2204041, counted=2204035).
> Fix? no
> image1: ********** WARNING: Filesystem still has errors **********
> image1: 13008/655360 files (0.3% non-contiguous), 417399/2621440 blocks
> 
> I backup the image to image_bk, and then mount the image to a dir, and cat all files in the image.
> Steps:
> # dd if=image1 of=image_bk
> # mount image1 err_dir
> # find -name '*' -exec cat > /dev/null {} \;
> 
> There are no issues during catting, and no err in dmesg too.
> 
> *But when I umount the image1 from err_dir, The fsck result didn't show any fs corruption info.
> 
> I mount image_bk to err_dir and umount it with no operation directly. The result is same to iamge1.
> 
> *So, is fs in the image as a block device via loop device damaged really, or does it have some others issues? 
> Could you give me some opinions?
> 
> 
> Thanks.
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: testing result of loop-aio patchset on ext3
  2014-07-16  7:58     ` Lukáš Czerner
@ 2014-07-16  9:28       ` Rui Xiang
  2014-07-18  9:10         ` Lukáš Czerner
  0 siblings, 1 reply; 11+ messages in thread
From: Rui Xiang @ 2014-07-16  9:28 UTC (permalink / raw)
  To: Lukáš Czerner
  Cc: Dave Kleikamp, linux-ext4, linux-fsdevel, linux-kernel, Li Zefan

On 2014/7/16 15:58, Lukáš Czerner wrote:
> On Wed, 16 Jul 2014, Rui Xiang wrote:
> 
>> Date: Wed, 16 Jul 2014 11:54:24 +0800
>> From: Rui Xiang <rui.xiang@huawei.com>
>> To: Lukáš Czerner <lczerner@redhat.com>
>> Cc: Dave Kleikamp <dave.kleikamp@oracle.com>, linux-ext4@vger.kernel.org,
>>     linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
>>     Li Zefan <lizefan@huawei.com>
>> Subject: Re: testing result of loop-aio patchset on ext3
>>
>> On 2014/7/14 17:51, Lukáš Czerner wrote:
>>> On Mon, 14 Jul 2014, Rui Xiang wrote:
>>>
>>>> Date: Mon, 14 Jul 2014 17:34:38 +0800
>>>> From: Rui Xiang <rui.xiang@huawei.com>
>>>> To: Dave Kleikamp <dave.kleikamp@oracle.com>, linux-ext4@vger.kernel.org
>>>> Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
>>>>     Li Zefan <lizefan@huawei.com>
>>>> Subject: testing result of loop-aio patchset on ext3
>>>>
>>>> Hi Dave,
>>>>
>>>> We export a container image file as a block device via loop device, but we
>>>> found it's very easy that the container rootfs gets corrupted due to power
>>>> loss.
>>>>
>>>> Your early version of loop-aio patchset said the patchset can make loop
>>>> mounted filesystems recoverable(lkml.org/lkml/2012/3/30/317), but we found
>>>> it doesn't help.
>>>>
>>>> Both the guest fs and host fs are ext3.
>>>>
>>>> The loop-aio patchset is from:
>>>> git://github.com/kleikamp/linux-shaggy.git aio_loop
>>>>
>>>> Steps:
>>>> 1. dd a 10G image, mkfs.ext3,
>>>>   # dd if=/dev/zero of=./raw_image bs=1M count=10000
>>>>   # echo y | mkfs.ext3 raw_image
>>>>
>>>> 2. losetup a loop device, mount at ./test_dir
>>>>   # losetup /dev/loop1 raw_image
>>>>   # mount /dev/loop1 ./test_dir
>>>>
>>>> 3. copy fs_mark into test_dir and run
>>>>   # ./fs_mark -d ./tmp/ -s 102400000 -n 80
>>>>
>>>> 4. during runing fs_mark, make systerm reboot indirectly.
>>>>   # echo b > /proc/sysrq-trigger
>>>>
>>>> After systerm booted up, sometimes fsck reported raw_image fs has been damaged.
>>>>
>>>> # fsck.ext3 -n raw_image
>>>> e2fsck 1.41.9 (22-Aug-2009)
>>>> Warning: skipping journal recovery because doing a read-only filesystem check.
>>>> raw_image contains a file system with errors, check forced.
>>>> Pass 1: Checking inodes, blocks, and sizes
>>>> Pass 2: Checking directory structure
>>>> Pass 3: Checking directory connectivity
>>>> Pass 4: Checking reference counts
>>>> Pass 5: Checking group summary information
>>>> Free blocks count wrong (2481348, counted=2480577).
>>>> Fix? no
>>>> Free inodes count wrong (640837, counted=640835).
>>>> Fix? no
>>>> raw_image: ********** WARNING: Filesystem still has errors **********
>>>> raw_image: 11/640848 files (0.0% non-contiguous), 78652/2560000 blocks
>>>
>>> It's not damaged, this is expected result if you're using old
>>> e2fsprogs which still treats this as an error.
>>>
>>> It's not an error because we only update superblock summary at
>>> unmount time so with unclean shutdown it's likely that it does not
>>> match the reality, but e2fsck can and will easily fix that for you.
>>>
>>> Please try e2fsprogs v1.42.3 or newer.
>>>
>>
>> Hi Lukas,
>>
>> I updated e2fsprogs to v1.42.3, and user the newer fsck.ext3 to check raw_image.
>> Exactly, the result seemed normal.
> 
> Now I can see that there are much more problems than before, that's
> weird. Sorry for not making this clear, but for this kind of
> reproducers please use the most recent e2fsprogs. Also , what is the
> kernel version you're using in this test ?
> 

I used the most recent e2fsprogs, 1.42.11, to check, and the errors reported are
the same as with v1.42.3. So that doesn't seem to be the reason.

As for the kernel, the version used in this test is stable 3.4.


Thanks!

> Thanks!
> -Lukas
> 
>>
>> Then, I continue my previous test. And after testing 35 times, "fsck -n" reported image fs
>> had been damaged, too.
>>
>>  # fsck.ext3 -n image1
>> e2fsck 1.42.3.wc1 (28-May-2012)
>> Warning: skipping journal recovery because doing a read-only filesystem check.
>> image1 has been mounted 36 times without being checked, check forced.
>> Pass 1: Checking inodes, blocks, and sizes
>> Inode 16407, i_size is 597447, should be 602112.  Fix? no
>> Inode 16407, i_blocks is 1176, should be 1184.  Fix? no
>> Inode 409941, i_blocks is 200208, should be 112.  Fix? no
>> Pass 2: Checking directory structure
>> Pass 3: Checking directory connectivity
>> Pass 4: Checking reference counts
>> Pass 5: Checking group summary information
>> Block bitmap differences:  -1506836 -1506843 -(1506859--1506860) -(1660941--1661964) -(1661966--1671167) -(1671688--1686473)
>> Fix? no
>> Free blocks count wrong for group #2 (31558, counted=31556).
>> Fix? no
>> Free blocks count wrong for group #43 (15871, counted=15867).
>> Fix? no
>> Free blocks count wrong (2204041, counted=2204035).
>> Fix? no
>> image1: ********** WARNING: Filesystem still has errors **********
>> image1: 13008/655360 files (0.3% non-contiguous), 417399/2621440 blocks
>>
>> I backup the image to image_bk, and then mount the image to a dir, and cat all files in the image.
>> Steps:
>> # dd if=image1 of=image_bk
>> # mount image1 err_dir
>> # find -name '*' -exec cat > /dev/null {} \;
>>
>> There are no issues during catting, and no err in dmesg too.
>>
>> *But when I umount the image1 from err_dir, The fsck result didn't show any fs corruption info.
>>
>> I mount image_bk to err_dir and umount it with no operation directly. The result is same to iamge1.
>>
>> *So, is fs in the image as a block device via loop device damaged really, or does it have some others issues? 
>> Could you give me some opinions?
>>
>>
>> Thanks.
>>
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: testing result of loop-aio patchset on ext3
  2014-07-16  9:28       ` Rui Xiang
@ 2014-07-18  9:10         ` Lukáš Czerner
  2014-07-21  2:34             ` Rui Xiang
  0 siblings, 1 reply; 11+ messages in thread
From: Lukáš Czerner @ 2014-07-18  9:10 UTC (permalink / raw)
  To: Rui Xiang
  Cc: Dave Kleikamp, linux-ext4, linux-fsdevel, linux-kernel, Li Zefan

On Wed, 16 Jul 2014, Rui Xiang wrote:

> Date: Wed, 16 Jul 2014 17:28:10 +0800
> From: Rui Xiang <rui.xiang@huawei.com>
> To: Lukáš Czerner <lczerner@redhat.com>
> Cc: Dave Kleikamp <dave.kleikamp@oracle.com>, linux-ext4@vger.kernel.org,
>     linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
>     Li Zefan <lizefan@huawei.com>
> Subject: Re: testing result of loop-aio patchset on ext3
> 
> On 2014/7/16 15:58, Lukáš Czerner wrote:
> > On Wed, 16 Jul 2014, Rui Xiang wrote:
> > 
> >> Date: Wed, 16 Jul 2014 11:54:24 +0800
> >> From: Rui Xiang <rui.xiang@huawei.com>
> >> To: Lukáš Czerner <lczerner@redhat.com>
> >> Cc: Dave Kleikamp <dave.kleikamp@oracle.com>, linux-ext4@vger.kernel.org,
> >>     linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
> >>     Li Zefan <lizefan@huawei.com>
> >> Subject: Re: testing result of loop-aio patchset on ext3
> >>
> >> On 2014/7/14 17:51, Lukáš Czerner wrote:
> >>> On Mon, 14 Jul 2014, Rui Xiang wrote:
> >>>
> >>>> Date: Mon, 14 Jul 2014 17:34:38 +0800
> >>>> From: Rui Xiang <rui.xiang@huawei.com>
> >>>> To: Dave Kleikamp <dave.kleikamp@oracle.com>, linux-ext4@vger.kernel.org
> >>>> Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
> >>>>     Li Zefan <lizefan@huawei.com>
> >>>> Subject: testing result of loop-aio patchset on ext3
> >>>>
> >>>> Hi Dave,
> >>>>
> >>>> We export a container image file as a block device via loop device, but we
> >>>> found it's very easy that the container rootfs gets corrupted due to power
> >>>> loss.
> >>>>
> >>>> Your early version of loop-aio patchset said the patchset can make loop
> >>>> mounted filesystems recoverable(lkml.org/lkml/2012/3/30/317), but we found
> >>>> it doesn't help.
> >>>>
> >>>> Both the guest fs and host fs are ext3.
> >>>>
> >>>> The loop-aio patchset is from:
> >>>> git://github.com/kleikamp/linux-shaggy.git aio_loop
> >>>>
> >>>> Steps:
> >>>> 1. dd a 10G image, mkfs.ext3,
> >>>>   # dd if=/dev/zero of=./raw_image bs=1M count=10000
> >>>>   # echo y | mkfs.ext3 raw_image
> >>>>
> >>>> 2. losetup a loop device, mount at ./test_dir
> >>>>   # losetup /dev/loop1 raw_image
> >>>>   # mount /dev/loop1 ./test_dir
> >>>>
> >>>> 3. copy fs_mark into test_dir and run
> >>>>   # ./fs_mark -d ./tmp/ -s 102400000 -n 80
> >>>>
> >>>> 4. during runing fs_mark, make systerm reboot indirectly.
> >>>>   # echo b > /proc/sysrq-trigger
> >>>>
> >>>> After systerm booted up, sometimes fsck reported raw_image fs has been damaged.
> >>>>
> >>>> # fsck.ext3 -n raw_image
> >>>> e2fsck 1.41.9 (22-Aug-2009)
> >>>> Warning: skipping journal recovery because doing a read-only filesystem check.
> >>>> raw_image contains a file system with errors, check forced.
> >>>> Pass 1: Checking inodes, blocks, and sizes
> >>>> Pass 2: Checking directory structure
> >>>> Pass 3: Checking directory connectivity
> >>>> Pass 4: Checking reference counts
> >>>> Pass 5: Checking group summary information
> >>>> Free blocks count wrong (2481348, counted=2480577).
> >>>> Fix? no
> >>>> Free inodes count wrong (640837, counted=640835).
> >>>> Fix? no
> >>>> raw_image: ********** WARNING: Filesystem still has errors **********
> >>>> raw_image: 11/640848 files (0.0% non-contiguous), 78652/2560000 blocks
> >>>
> >>> It's not damaged, this is expected result if you're using old
> >>> e2fsprogs which still treats this as an error.
> >>>
> >>> It's not an error because we only update superblock summary at
> >>> unmount time so with unclean shutdown it's likely that it does not
> >>> match the reality, but e2fsck can and will easily fix that for you.
> >>>
> >>> Please try e2fsprogs v1.42.3 or newer.
> >>>
> >>
> >> Hi Lukas,
> >>
> >> I updated e2fsprogs to v1.42.3, and user the newer fsck.ext3 to check raw_image.
> >> Exactly, the result seemed normal.
> > 
> > Now I can see that there are much more problems than before, that's
> > weird. Sorry for not making this clear, but for this kind of
> > reproducers please use the most recent e2fsprogs. Also , what is the
> > kernel version you're using in this test ?
> > 
> 
> I use the most recent e2fsprogs 1.42.11 to check, and the error info is same as
> result fscked by v1.42.3. It seems that shouldn't be the reason.
> 
> Otherwise, the kernel version in this test is stable 3.4.

In that case, the problem is somewhere else. I'll try to reproduce it
and see what I can find.

I assume you're not able to reproduce this on a real device?
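(That is, the same steps as in your report, just pointed at a spare
test partition; /dev/sdX1 below is only a placeholder:)

  # mkfs.ext3 /dev/sdX1
  # mount /dev/sdX1 ./test_dir
  # cd ./test_dir && ./fs_mark -d ./tmp/ -s 102400000 -n 80 &
  # echo b > /proc/sysrq-trigger       # while fs_mark is still running
  ... and after the reboot:
  # fsck.ext3 -n /dev/sdX1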

Thanks!
-Lukas

> 
> 
> Thanks!
> 
> > Thanks!
> > -Lukas
> > 
> >>
> >> Then, I continue my previous test. And after testing 35 times, "fsck -n" reported image fs
> >> had been damaged, too.
> >>
> >>  # fsck.ext3 -n image1
> >> e2fsck 1.42.3.wc1 (28-May-2012)
> >> Warning: skipping journal recovery because doing a read-only filesystem check.
> >> image1 has been mounted 36 times without being checked, check forced.
> >> Pass 1: Checking inodes, blocks, and sizes
> >> Inode 16407, i_size is 597447, should be 602112.  Fix? no
> >> Inode 16407, i_blocks is 1176, should be 1184.  Fix? no
> >> Inode 409941, i_blocks is 200208, should be 112.  Fix? no
> >> Pass 2: Checking directory structure
> >> Pass 3: Checking directory connectivity
> >> Pass 4: Checking reference counts
> >> Pass 5: Checking group summary information
> >> Block bitmap differences:  -1506836 -1506843 -(1506859--1506860) -(1660941--1661964) -(1661966--1671167) -(1671688--1686473)
> >> Fix? no
> >> Free blocks count wrong for group #2 (31558, counted=31556).
> >> Fix? no
> >> Free blocks count wrong for group #43 (15871, counted=15867).
> >> Fix? no
> >> Free blocks count wrong (2204041, counted=2204035).
> >> Fix? no
> >> image1: ********** WARNING: Filesystem still has errors **********
> >> image1: 13008/655360 files (0.3% non-contiguous), 417399/2621440 blocks
> >>
> >> I backup the image to image_bk, and then mount the image to a dir, and cat all files in the image.
> >> Steps:
> >> # dd if=image1 of=image_bk
> >> # mount image1 err_dir
> >> # find -name '*' -exec cat > /dev/null {} \;
> >>
> >> There are no issues during catting, and no err in dmesg too.
> >>
> >> *But when I umount the image1 from err_dir, The fsck result didn't show any fs corruption info.
> >>
> >> I mount image_bk to err_dir and umount it with no operation directly. The result is same to iamge1.
> >>
> >> *So, is fs in the image as a block device via loop device damaged really, or does it have some others issues? 
> >> Could you give me some opinions?
> >>
> >>
> >> Thanks.
> >>
> >>
> >> --
> >> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> >> the body of a message to majordomo@vger.kernel.org
> >> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: testing result of loop-aio patchset on ext3
  2014-07-18  9:10         ` Lukáš Czerner
@ 2014-07-21  2:34             ` Rui Xiang
  0 siblings, 0 replies; 11+ messages in thread
From: Rui Xiang @ 2014-07-21  2:34 UTC (permalink / raw)
  To: Lukáš Czerner
  Cc: Dave Kleikamp, linux-ext4, linux-fsdevel, linux-kernel, Li Zefan

On 2014/7/18 17:10, Lukáš Czerner wrote:
> On Wed, 16 Jul 2014, Rui Xiang wrote:
> 
>> Date: Wed, 16 Jul 2014 17:28:10 +0800
>> From: Rui Xiang <rui.xiang@huawei.com>
>> To: Lukáš Czerner <lczerner@redhat.com>
>> Cc: Dave Kleikamp <dave.kleikamp@oracle.com>, linux-ext4@vger.kernel.org,
>>     linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
>>     Li Zefan <lizefan@huawei.com>
>> Subject: Re: testing result of loop-aio patchset on ext3
>>
>> On 2014/7/16 15:58, Lukáš Czerner wrote:
>>> On Wed, 16 Jul 2014, Rui Xiang wrote:
>>>
>>>> Date: Wed, 16 Jul 2014 11:54:24 +0800
>>>> From: Rui Xiang <rui.xiang@huawei.com>
>>>> To: Lukáš Czerner <lczerner@redhat.com>
>>>> Cc: Dave Kleikamp <dave.kleikamp@oracle.com>, linux-ext4@vger.kernel.org,
>>>>     linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
>>>>     Li Zefan <lizefan@huawei.com>
>>>> Subject: Re: testing result of loop-aio patchset on ext3
>>>>
>>>> On 2014/7/14 17:51, Lukáš Czerner wrote:
>>>>> On Mon, 14 Jul 2014, Rui Xiang wrote:
>>>>>
>>>>>> Date: Mon, 14 Jul 2014 17:34:38 +0800
>>>>>> From: Rui Xiang <rui.xiang@huawei.com>
>>>>>> To: Dave Kleikamp <dave.kleikamp@oracle.com>, linux-ext4@vger.kernel.org
>>>>>> Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
>>>>>>     Li Zefan <lizefan@huawei.com>
>>>>>> Subject: testing result of loop-aio patchset on ext3
>>>>>>
>>>>>> Hi Dave,
>>>>>>
>>>>>> We export a container image file as a block device via loop device, but we
>>>>>> found it's very easy that the container rootfs gets corrupted due to power
>>>>>> loss.
>>>>>>
>>>>>> Your early version of loop-aio patchset said the patchset can make loop
>>>>>> mounted filesystems recoverable(lkml.org/lkml/2012/3/30/317), but we found
>>>>>> it doesn't help.
>>>>>>
>>>>>> Both the guest fs and host fs are ext3.
>>>>>>
>>>>>> The loop-aio patchset is from:
>>>>>> git://github.com/kleikamp/linux-shaggy.git aio_loop
>>>>>>
>>>>>> Steps:
>>>>>> 1. dd a 10G image, mkfs.ext3,
>>>>>>   # dd if=/dev/zero of=./raw_image bs=1M count=10000
>>>>>>   # echo y | mkfs.ext3 raw_image
>>>>>>
>>>>>> 2. losetup a loop device, mount at ./test_dir
>>>>>>   # losetup /dev/loop1 raw_image
>>>>>>   # mount /dev/loop1 ./test_dir
>>>>>>
>>>>>> 3. copy fs_mark into test_dir and run
>>>>>>   # ./fs_mark -d ./tmp/ -s 102400000 -n 80
>>>>>>
>>>>>> 4. during runing fs_mark, make systerm reboot indirectly.
>>>>>>   # echo b > /proc/sysrq-trigger
>>>>>>
>>>>>> After systerm booted up, sometimes fsck reported raw_image fs has been damaged.
>>>>>>
>>>>>> # fsck.ext3 -n raw_image
>>>>>> e2fsck 1.41.9 (22-Aug-2009)
>>>>>> Warning: skipping journal recovery because doing a read-only filesystem check.
>>>>>> raw_image contains a file system with errors, check forced.
>>>>>> Pass 1: Checking inodes, blocks, and sizes
>>>>>> Pass 2: Checking directory structure
>>>>>> Pass 3: Checking directory connectivity
>>>>>> Pass 4: Checking reference counts
>>>>>> Pass 5: Checking group summary information
>>>>>> Free blocks count wrong (2481348, counted=2480577).
>>>>>> Fix? no
>>>>>> Free inodes count wrong (640837, counted=640835).
>>>>>> Fix? no
>>>>>> raw_image: ********** WARNING: Filesystem still has errors **********
>>>>>> raw_image: 11/640848 files (0.0% non-contiguous), 78652/2560000 blocks
>>>>>
>>>>> It's not damaged, this is expected result if you're using old
>>>>> e2fsprogs which still treats this as an error.
>>>>>
>>>>> It's not an error because we only update superblock summary at
>>>>> unmount time so with unclean shutdown it's likely that it does not
>>>>> match the reality, but e2fsck can and will easily fix that for you.
>>>>>
>>>>> Please try e2fsprogs v1.42.3 or newer.
>>>>>
>>>>
>>>> Hi Lukas,
>>>>
>>>> I updated e2fsprogs to v1.42.3, and user the newer fsck.ext3 to check raw_image.
>>>> Exactly, the result seemed normal.
>>>
>>> Now I can see that there are much more problems than before, that's
>>> weird. Sorry for not making this clear, but for this kind of
>>> reproducers please use the most recent e2fsprogs. Also , what is the
>>> kernel version you're using in this test ?
>>>
>>
>> I use the most recent e2fsprogs 1.42.11 to check, and the error info is same as
>> result fscked by v1.42.3. It seems that shouldn't be the reason.
>>
>> Otherwise, the kernel version in this test is stable 3.4.
> 
> In that case, this is a problem somewhere else. I'll try to
> reproduce and see what I can see.
> 
> I assume you're not able to reproduce this on a real device ?
> 

Yes, in my tests it only occurs on a loop device.

Besides, there was another case in this test:

I ran fsck with "-n" on the bad image; the result contains 7 issues.

# fsck.ext3 -n image1
Warning: skipping journal recovery because doing a read-only filesystem check.
image1 has been mounted 36 times without being checked, check forced.
Pass 1: Checking inodes, blocks, and sizes
*Inode 16407, i_size is 643005, should be 647168.  Fix? no
*Inode 16407, i_blocks is 1264, should be 1272.  Fix? no
*Inode 409941, i_blocks is 200208, should be 16688.  Fix? no
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
*Block bitmap differences:  -1643951 +1644741 -(1646592--1646598) +(1648640--1648646) -(1657079--1658102) -(1658104--1659127) -(1659129--1660152) -(1660154--1661177) -(1661179--1662202) -(1662204--1663227) -(1663229--1664252) -(1664254--1665277) -(1665279--1666302) -(1666304--1667327) -(1667329--1668352) -(1668354--1669377) -(1669379--1670402) -(1670404--1671167) -(1671688--1671947) -(1671949--1672972) -(1672974--1673997) -(1673999--1675022) -(1675024--1676047) -(1676049--1677072) -(1677074--1678097) -(1678099--1679122) -(1679124--1680147) -(1680149--1680560)
Fix? no
*Free blocks count wrong for group #2 (31522, counted=31520).
Fix? no
*Free blocks count wrong for group #43 (15870, counted=15871).
Fix? no
*Free blocks count wrong for group #45 (398, counted=396).
Fix? no
*Free blocks count wrong (2203971, counted=2203968).
Fix? no
image1: ********** WARNING: Filesystem still has errors **********
image1: 13008/655360 files (0.3% non-contiguous), 417469/2621440 blocks

When I "fsck -y" the image, it seems that only fixes 1 issue.

# fsck.ext3 -y image1
image1: recovering journal
image1 has been mounted 36 times without being checked, check forced.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
*Free blocks count wrong (2203971, counted=2203968).
Fix<y>? yes
image1: ***** FILE SYSTEM WAS MODIFIED *****
image1: 13008/655360 files (0.3% non-contiguous), 417472/2621440 blocks

So I assume the journal is recovered before the filesystem check when
running "fsck -y", and the other issues are fixed as part of the
journal recovery. Is that right?
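One way I could check that assumption is on a scratch copy of the
image (the copy name here is made up):

  # dd if=image1 of=image_scratch
  # dumpe2fs -h image_scratch | grep -i 'state\|recover'   # needs_recovery set: replay pending
  # fsck.ext3 -fy image_scratch    # replays the journal, then runs the full check
  # fsck.ext3 -fn image_scratch    # a second read-only pass should now come back clean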

Thanks!

> Thanks!
> -Lukas
> 
>>
>>
>> Thanks!
>>
>>> Thanks!
>>> -Lukas
>>>
>>>>
>>>> Then, I continue my previous test. And after testing 35 times, "fsck -n" reported image fs
>>>> had been damaged, too.
>>>>
>>>>  # fsck.ext3 -n image1
>>>> e2fsck 1.42.3.wc1 (28-May-2012)
>>>> Warning: skipping journal recovery because doing a read-only filesystem check.
>>>> image1 has been mounted 36 times without being checked, check forced.
>>>> Pass 1: Checking inodes, blocks, and sizes
>>>> Inode 16407, i_size is 597447, should be 602112.  Fix? no
>>>> Inode 16407, i_blocks is 1176, should be 1184.  Fix? no
>>>> Inode 409941, i_blocks is 200208, should be 112.  Fix? no
>>>> Pass 2: Checking directory structure
>>>> Pass 3: Checking directory connectivity
>>>> Pass 4: Checking reference counts
>>>> Pass 5: Checking group summary information
>>>> Block bitmap differences:  -1506836 -1506843 -(1506859--1506860) -(1660941--1661964) -(1661966--1671167) -(1671688--1686473)
>>>> Fix? no
>>>> Free blocks count wrong for group #2 (31558, counted=31556).
>>>> Fix? no
>>>> Free blocks count wrong for group #43 (15871, counted=15867).
>>>> Fix? no
>>>> Free blocks count wrong (2204041, counted=2204035).
>>>> Fix? no
>>>> image1: ********** WARNING: Filesystem still has errors **********
>>>> image1: 13008/655360 files (0.3% non-contiguous), 417399/2621440 blocks
>>>>
>>>> I backup the image to image_bk, and then mount the image to a dir, and cat all files in the image.
>>>> Steps:
>>>> # dd if=image1 of=image_bk
>>>> # mount image1 err_dir
>>>> # find -name '*' -exec cat > /dev/null {} \;
>>>>
>>>> There are no issues during catting, and no err in dmesg too.
>>>>
>>>> *But when I umount the image1 from err_dir, The fsck result didn't show any fs corruption info.
>>>>
>>>> I mount image_bk to err_dir and umount it with no operation directly. The result is same to iamge1.
>>>>
>>>> *So, is fs in the image as a block device via loop device damaged really, or does it have some others issues? 
>>>> Could you give me some opinions?
>>>>
>>>>
>>>> Thanks.
>>>>
>>>>
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
>>>> the body of a message to majordomo@vger.kernel.org
>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: testing result of loop-aio patchset on ext3
  2014-07-21  2:34             ` Rui Xiang
  (?)
@ 2014-08-07  2:42             ` Rui Xiang
  -1 siblings, 0 replies; 11+ messages in thread
From: Rui Xiang @ 2014-08-07  2:42 UTC (permalink / raw)
  To: Lukáš Czerner
  Cc: Dave Kleikamp, linux-ext4, linux-fsdevel, linux-kernel, Li Zefan

On 2014/7/21 10:34, Rui Xiang wrote:
> On 2014/7/18 17:10, Lukáš Czerner wrote:
>> On Wed, 16 Jul 2014, Rui Xiang wrote:
>>
>>> Date: Wed, 16 Jul 2014 17:28:10 +0800
>>> From: Rui Xiang <rui.xiang@huawei.com>
>>> To: Lukáš Czerner <lczerner@redhat.com>
>>> Cc: Dave Kleikamp <dave.kleikamp@oracle.com>, linux-ext4@vger.kernel.org,
>>>     linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
>>>     Li Zefan <lizefan@huawei.com>
>>> Subject: Re: testing result of loop-aio patchset on ext3
>>>
>>> On 2014/7/16 15:58, Lukáš Czerner wrote:
>>>> On Wed, 16 Jul 2014, Rui Xiang wrote:
>>>>
>>>>> Date: Wed, 16 Jul 2014 11:54:24 +0800
>>>>> From: Rui Xiang <rui.xiang@huawei.com>
>>>>> To: Lukáš Czerner <lczerner@redhat.com>
>>>>> Cc: Dave Kleikamp <dave.kleikamp@oracle.com>, linux-ext4@vger.kernel.org,
>>>>>     linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
>>>>>     Li Zefan <lizefan@huawei.com>
>>>>> Subject: Re: testing result of loop-aio patchset on ext3
>>>>>
>>>>> On 2014/7/14 17:51, Lukáš Czerner wrote:
>>>>>> On Mon, 14 Jul 2014, Rui Xiang wrote:
>>>>>>
>>>>>>> Date: Mon, 14 Jul 2014 17:34:38 +0800
>>>>>>> From: Rui Xiang <rui.xiang@huawei.com>
>>>>>>> To: Dave Kleikamp <dave.kleikamp@oracle.com>, linux-ext4@vger.kernel.org
>>>>>>> Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
>>>>>>>     Li Zefan <lizefan@huawei.com>
>>>>>>> Subject: testing result of loop-aio patchset on ext3
>>>>>>>
>>>>>>> Hi Dave,
>>>>>>>
>>>>>>> We export a container image file as a block device via loop device, but we
>>>>>>> found it's very easy that the container rootfs gets corrupted due to power
>>>>>>> loss.
>>>>>>>
>>>>>>> Your early version of loop-aio patchset said the patchset can make loop
>>>>>>> mounted filesystems recoverable(lkml.org/lkml/2012/3/30/317), but we found
>>>>>>> it doesn't help.
>>>>>>>
>>>>>>> Both the guest fs and host fs are ext3.
>>>>>>>
>>>>>>> The loop-aio patchset is from:
>>>>>>> git://github.com/kleikamp/linux-shaggy.git aio_loop
>>>>>>>
>>>>>>> Steps:
>>>>>>> 1. dd a 10G image, mkfs.ext3,
>>>>>>>   # dd if=/dev/zero of=./raw_image bs=1M count=10000
>>>>>>>   # echo y | mkfs.ext3 raw_image
>>>>>>>
>>>>>>> 2. losetup a loop device, mount at ./test_dir
>>>>>>>   # losetup /dev/loop1 raw_image
>>>>>>>   # mount /dev/loop1 ./test_dir
>>>>>>>
>>>>>>> 3. copy fs_mark into test_dir and run
>>>>>>>   # ./fs_mark -d ./tmp/ -s 102400000 -n 80
>>>>>>>
>>>>>>> 4. during runing fs_mark, make systerm reboot indirectly.
>>>>>>>   # echo b > /proc/sysrq-trigger
>>>>>>>
>>>>>>> After systerm booted up, sometimes fsck reported raw_image fs has been damaged.
>>>>>>>
>>>>>>> # fsck.ext3 -n raw_image
>>>>>>> e2fsck 1.41.9 (22-Aug-2009)
>>>>>>> Warning: skipping journal recovery because doing a read-only filesystem check.
>>>>>>> raw_image contains a file system with errors, check forced.
>>>>>>> Pass 1: Checking inodes, blocks, and sizes
>>>>>>> Pass 2: Checking directory structure
>>>>>>> Pass 3: Checking directory connectivity
>>>>>>> Pass 4: Checking reference counts
>>>>>>> Pass 5: Checking group summary information
>>>>>>> Free blocks count wrong (2481348, counted=2480577).
>>>>>>> Fix? no
>>>>>>> Free inodes count wrong (640837, counted=640835).
>>>>>>> Fix? no
>>>>>>> raw_image: ********** WARNING: Filesystem still has errors **********
>>>>>>> raw_image: 11/640848 files (0.0% non-contiguous), 78652/2560000 blocks
>>>>>>
>>>>>> It's not damaged; this is the expected result if you're using an old
>>>>>> e2fsprogs which still treats this as an error.
>>>>>>
>>>>>> It's not an error because we only update the superblock summary at
>>>>>> unmount time, so after an unclean shutdown it's likely not to match
>>>>>> reality, but e2fsck can and will easily fix that for you.
>>>>>>
>>>>>> Please try e2fsprogs v1.42.3 or newer.
>>>>>>
>>>>>
>>>>> Hi Lukas,
>>>>>
>>>>> I updated e2fsprogs to v1.42.3 and used the newer fsck.ext3 to check raw_image.
>>>>> Indeed, the result then looked normal.
>>>>
>>>> Now I can see that there are many more problems than before, which is
>>>> weird. Sorry for not making this clear, but for this kind of
>>>> reproducer please use the most recent e2fsprogs. Also, what is the
>>>> kernel version you're using in this test?
>>>>
>>>
>>> I used the most recent e2fsprogs 1.42.11 to check, and the error info is the same
>>> as the result from v1.42.3, so that doesn't seem to be the reason.
>>>
>>> Also, the kernel version in this test is stable 3.4.
>>
>> In that case, this is a problem somewhere else. I'll try to
>> reproduce and see what I can see.
>>
>> I assume you're not able to reproduce this on a real device?
>>
> 
> Yes, in my test it only occurs on a loop device.
> 
> Also, there was another case in this test:
> 
> I checked the bad image with "fsck -n"; the result contains 7 issues.
> 
> # fsck.ext3 -n image1
> Warning: skipping journal recovery because doing a read-only filesystem check.
> image1 has been mounted 36 times without being checked, check forced.
> Pass 1: Checking inodes, blocks, and sizes
> *Inode 16407, i_size is 643005, should be 647168.  Fix? no
> *Inode 16407, i_blocks is 1264, should be 1272.  Fix? no
> *Inode 409941, i_blocks is 200208, should be 16688.  Fix? no
> Pass 2: Checking directory structure
> Pass 3: Checking directory connectivity
> Pass 4: Checking reference counts
> Pass 5: Checking group summary information
> *Block bitmap differences:  -1643951 +1644741 -(1646592--1646598) +(1648640--1648646) -(1657079--1658102) -(1658104--1659127) -(1659129--1660152) -(1660154--1661177) -(1661179--1662202) -(1662204--1663227) -(1663229--1664252) -(1664254--1665277) -(1665279--1666302) -(1666304--1667327) -(1667329--1668352) -(1668354--1669377) -(1669379--1670402) -(1670404--1671167) -(1671688--1671947) -(1671949--1672972) -(1672974--1673997) -(1673999--1675022) -(1675024--1676047) -(1676049--1677072) -(1677074--1678097) -(1678099--1679122) -(1679124--1680147) -(1680149--1680560)
> Fix? no
> *Free blocks count wrong for group #2 (31522, counted=31520).
> Fix? no
> *Free blocks count wrong for group #43 (15870, counted=15871).
> Fix? no
> *Free blocks count wrong for group #45 (398, counted=396).
> Fix? no
> *Free blocks count wrong (2203971, counted=2203968).
> Fix? no
> image1: ********** WARNING: Filesystem still has errors **********
> image1: 13008/655360 files (0.3% non-contiguous), 417469/2621440 blocks
> 
> When I run "fsck -y" on the image, it seems to fix only 1 issue.
> 
> # fsck.ext3 -y image1
> image1: recovering journal
> image1 has been mounted 36 times without being checked, check forced.
> Pass 1: Checking inodes, blocks, and sizes
> Pass 2: Checking directory structure
> Pass 3: Checking directory connectivity
> Pass 4: Checking reference counts
> Pass 5: Checking group summary information
> *Free blocks count wrong (2203971, counted=2203968).
> Fix<y>? yes
> image1: ***** FILE SYSTEM WAS MODIFIED *****
> image1: 13008/655360 files (0.3% non-contiguous), 417472/2621440 blocks
> 
> So, I assume the journal is recovered before the fs check when doing
> "fsck -y", and the other issues are fixed while the journal is being
> recovered. Is that right?
> 

Hi Lukas,




> Thanks!
> 
>> Thanks!
>> -Lukas
>>
>>>
>>>
>>> Thanks!
>>>
>>>> Thanks!
>>>> -Lukas
>>>>
>>>>>
>>>>> Then I continued my previous test. After 35 test runs, "fsck -n" again reported that the image fs
>>>>> had been damaged.
>>>>>
>>>>>  # fsck.ext3 -n image1
>>>>> e2fsck 1.42.3.wc1 (28-May-2012)
>>>>> Warning: skipping journal recovery because doing a read-only filesystem check.
>>>>> image1 has been mounted 36 times without being checked, check forced.
>>>>> Pass 1: Checking inodes, blocks, and sizes
>>>>> Inode 16407, i_size is 597447, should be 602112.  Fix? no
>>>>> Inode 16407, i_blocks is 1176, should be 1184.  Fix? no
>>>>> Inode 409941, i_blocks is 200208, should be 112.  Fix? no
>>>>> Pass 2: Checking directory structure
>>>>> Pass 3: Checking directory connectivity
>>>>> Pass 4: Checking reference counts
>>>>> Pass 5: Checking group summary information
>>>>> Block bitmap differences:  -1506836 -1506843 -(1506859--1506860) -(1660941--1661964) -(1661966--1671167) -(1671688--1686473)
>>>>> Fix? no
>>>>> Free blocks count wrong for group #2 (31558, counted=31556).
>>>>> Fix? no
>>>>> Free blocks count wrong for group #43 (15871, counted=15867).
>>>>> Fix? no
>>>>> Free blocks count wrong (2204041, counted=2204035).
>>>>> Fix? no
>>>>> image1: ********** WARNING: Filesystem still has errors **********
>>>>> image1: 13008/655360 files (0.3% non-contiguous), 417399/2621440 blocks
>>>>>
>>>>> I backed up the image to image_bk, then mounted the image on a dir and read all of its files with cat.
>>>>> Steps:
>>>>> # dd if=image1 of=image_bk
>>>>> # mount image1 err_dir
>>>>> # find -name '*' -exec cat > /dev/null {} \;
>>>>>
>>>>> There were no issues while reading the files, and no errors in dmesg either.
>>>>>
>>>>> *But after I umounted image1 from err_dir, the fsck result no longer showed any fs corruption.
>>>>>
>>>>> I also mounted image_bk on err_dir and umounted it directly without doing anything; the result is the same as for image1.
>>>>>
>>>>> *So, is the fs in the image, exported as a block device via the loop device, really
>>>>> damaged, or is there some other issue? Could you give me your opinion?
>>>>>
>>>>>
>>>>> Thanks.
>>>>>
>>>>>
>>>
>>>
> 



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: testing result of loop-aio patchset on ext3
  2014-07-21  2:34             ` Rui Xiang
  (?)
  (?)
@ 2014-08-07  3:09             ` Rui Xiang
  -1 siblings, 0 replies; 11+ messages in thread
From: Rui Xiang @ 2014-08-07  3:09 UTC (permalink / raw)
  To: Lukáš Czerner
  Cc: Dave Kleikamp, linux-ext4, linux-fsdevel, linux-kernel, Li Zefan

On 2014/7/21 10:34, Rui Xiang wrote:
> On 2014/7/18 17:10, Lukáš Czerner wrote:
>> On Wed, 16 Jul 2014, Rui Xiang wrote:
>>
>>> Date: Wed, 16 Jul 2014 17:28:10 +0800
>>> From: Rui Xiang <rui.xiang@huawei.com>
>>> To: Lukáš Czerner <lczerner@redhat.com>
>>> Cc: Dave Kleikamp <dave.kleikamp@oracle.com>, linux-ext4@vger.kernel.org,
>>>     linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
>>>     Li Zefan <lizefan@huawei.com>
>>> Subject: Re: testing result of loop-aio patchset on ext3
>>>
>>> On 2014/7/16 15:58, Lukáš Czerner wrote:
>>>> On Wed, 16 Jul 2014, Rui Xiang wrote:
>>>>
>>>>> Date: Wed, 16 Jul 2014 11:54:24 +0800
>>>>> From: Rui Xiang <rui.xiang@huawei.com>
>>>>> To: Lukáš Czerner <lczerner@redhat.com>
>>>>> Cc: Dave Kleikamp <dave.kleikamp@oracle.com>, linux-ext4@vger.kernel.org,
>>>>>     linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
>>>>>     Li Zefan <lizefan@huawei.com>
>>>>> Subject: Re: testing result of loop-aio patchset on ext3
>>>>>
>>>>> On 2014/7/14 17:51, Lukáš Czerner wrote:
>>>>>> On Mon, 14 Jul 2014, Rui Xiang wrote:
>>>>>>
>>>>>>> Date: Mon, 14 Jul 2014 17:34:38 +0800
>>>>>>> From: Rui Xiang <rui.xiang@huawei.com>
>>>>>>> To: Dave Kleikamp <dave.kleikamp@oracle.com>, linux-ext4@vger.kernel.org
>>>>>>> Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
>>>>>>>     Li Zefan <lizefan@huawei.com>
>>>>>>> Subject: testing result of loop-aio patchset on ext3
>>>>>>>
>>>>>>> Hi Dave,
>>>>>>>
>>>>>>> We export a container image file as a block device via loop device, but we
>>>>>>> found it's very easy that the container rootfs gets corrupted due to power
>>>>>>> loss.
>>>>>>>
>>>>>>> Your early version of loop-aio patchset said the patchset can make loop
>>>>>>> mounted filesystems recoverable(lkml.org/lkml/2012/3/30/317), but we found
>>>>>>> it doesn't help.
>>>>>>>
>>>>>>> Both the guest fs and host fs are ext3.
>>>>>>>
>>>>>>> The loop-aio patchset is from:
>>>>>>> git://github.com/kleikamp/linux-shaggy.git aio_loop
>>>>>>>
>>>>>>> Steps:
>>>>>>> 1. dd a 10G image, mkfs.ext3,
>>>>>>>   # dd if=/dev/zero of=./raw_image bs=1M count=10000
>>>>>>>   # echo y | mkfs.ext3 raw_image
>>>>>>>
>>>>>>> 2. losetup a loop device, mount at ./test_dir
>>>>>>>   # losetup /dev/loop1 raw_image
>>>>>>>   # mount /dev/loop1 ./test_dir
>>>>>>>
>>>>>>> 3. copy fs_mark into test_dir and run
>>>>>>>   # ./fs_mark -d ./tmp/ -s 102400000 -n 80
>>>>>>>
>>>>>>> 4. during runing fs_mark, make systerm reboot indirectly.
>>>>>>>   # echo b > /proc/sysrq-trigger
>>>>>>>
>>>>>>> After systerm booted up, sometimes fsck reported raw_image fs has been damaged.
>>>>>>>
>>>>>>> # fsck.ext3 -n raw_image
>>>>>>> e2fsck 1.41.9 (22-Aug-2009)
>>>>>>> Warning: skipping journal recovery because doing a read-only filesystem check.
>>>>>>> raw_image contains a file system with errors, check forced.
>>>>>>> Pass 1: Checking inodes, blocks, and sizes
>>>>>>> Pass 2: Checking directory structure
>>>>>>> Pass 3: Checking directory connectivity
>>>>>>> Pass 4: Checking reference counts
>>>>>>> Pass 5: Checking group summary information
>>>>>>> Free blocks count wrong (2481348, counted=2480577).
>>>>>>> Fix? no
>>>>>>> Free inodes count wrong (640837, counted=640835).
>>>>>>> Fix? no
>>>>>>> raw_image: ********** WARNING: Filesystem still has errors **********
>>>>>>> raw_image: 11/640848 files (0.0% non-contiguous), 78652/2560000 blocks
>>>>>>
>>>>>> It's not damaged; this is the expected result if you're using an old
>>>>>> e2fsprogs which still treats this as an error.
>>>>>>
>>>>>> It's not an error because we only update the superblock summary at
>>>>>> unmount time, so after an unclean shutdown it's likely not to match
>>>>>> reality, but e2fsck can and will easily fix that for you.
>>>>>>
>>>>>> Please try e2fsprogs v1.42.3 or newer.
>>>>>>
>>>>>
>>>>> Hi Lukas,
>>>>>
>>>>> I updated e2fsprogs to v1.42.3 and used the newer fsck.ext3 to check raw_image.
>>>>> Indeed, the result then looked normal.
>>>>
>>>> Now I can see that there are many more problems than before, which is
>>>> weird. Sorry for not making this clear, but for this kind of
>>>> reproducer please use the most recent e2fsprogs. Also, what is the
>>>> kernel version you're using in this test?
>>>>
>>>
>>> I used the most recent e2fsprogs 1.42.11 to check, and the error info is the same
>>> as the result from v1.42.3, so that doesn't seem to be the reason.
>>>
>>> Also, the kernel version in this test is stable 3.4.
>>
>> In that case, this is a problem somewhere else. I'll try to
>> reproduce and see what I can see.
>>
>> I assume you're not able to reproduce this on a real device?
>>
> 
> Yes, in my test it only occurs on a loop device.
> 
> Also, there was another case in this test:
> 
> I checked the bad image with "fsck -n"; the result contains 7 issues.
> 
> # fsck.ext3 -n image1
> Warning: skipping journal recovery because doing a read-only filesystem check.
> image1 has been mounted 36 times without being checked, check forced.
> Pass 1: Checking inodes, blocks, and sizes
> *Inode 16407, i_size is 643005, should be 647168.  Fix? no
> *Inode 16407, i_blocks is 1264, should be 1272.  Fix? no
> *Inode 409941, i_blocks is 200208, should be 16688.  Fix? no
> Pass 2: Checking directory structure
> Pass 3: Checking directory connectivity
> Pass 4: Checking reference counts
> Pass 5: Checking group summary information
> *Block bitmap differences:  -1643951 +1644741 -(1646592--1646598) +(1648640--1648646) -(1657079--1658102) -(1658104--1659127) -(1659129--1660152) -(1660154--1661177) -(1661179--1662202) -(1662204--1663227) -(1663229--1664252) -(1664254--1665277) -(1665279--1666302) -(1666304--1667327) -(1667329--1668352) -(1668354--1669377) -(1669379--1670402) -(1670404--1671167) -(1671688--1671947) -(1671949--1672972) -(1672974--1673997) -(1673999--1675022) -(1675024--1676047) -(1676049--1677072) -(1677074--1678097) -(1678099--1679122) -(1679124--1680147) -(1680149--1680560)
> Fix? no
> *Free blocks count wrong for group #2 (31522, counted=31520).
> Fix? no
> *Free blocks count wrong for group #43 (15870, counted=15871).
> Fix? no
> *Free blocks count wrong for group #45 (398, counted=396).
> Fix? no
> *Free blocks count wrong (2203971, counted=2203968).
> Fix? no
> image1: ********** WARNING: Filesystem still has errors **********
> image1: 13008/655360 files (0.3% non-contiguous), 417469/2621440 blocks
> 
> When I run "fsck -y" on the image, it seems to fix only 1 issue.
> 
> # fsck.ext3 -y image1
> image1: recovering journal
> image1 has been mounted 36 times without being checked, check forced.
> Pass 1: Checking inodes, blocks, and sizes
> Pass 2: Checking directory structure
> Pass 3: Checking directory connectivity
> Pass 4: Checking reference counts
> Pass 5: Checking group summary information
> *Free blocks count wrong (2203971, counted=2203968).
> Fix<y>? yes
> image1: ***** FILE SYSTEM WAS MODIFIED *****
> image1: 13008/655360 files (0.3% non-contiguous), 417472/2621440 blocks
> 
> So, I assume the journal is recovered before the fs check when doing
> "fsck -y", and the other issues are fixed while the journal is being
> recovered. Is that right?
> 

Hi Lukas,

Do you have any new thoughts on this?

Also, I found that in the above tests the issue left after recovering the journal was
always that the free blocks count was higher than the counted one.

> *Free blocks count wrong (2203971, counted=2203968).
> Fix<y>? yes


Is that fsck result acceptable, i.e. can we keep using the loop device, or does it
indicate real damage to the filesystem on top of the device?
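
To separate what the journal replay and the unmount-time summary update fix from what
e2fsck itself has to repair, something along the following lines could be tried (a rough
sketch; the copy name image_replay and the mount point ./replay_test are only examples):

1. copy the image and look at it before any replay
  # cp image1 image_replay
  # fsck.ext3 -fn image_replay
  # dumpe2fs -h image_replay | grep -i 'free blocks'

2. mount and cleanly unmount the copy; mounting replays the journal and a clean
   unmount rewrites the superblock summary
  # mkdir -p ./replay_test
  # mount -o loop image_replay ./replay_test
  # umount ./replay_test

3. force a read-only check again
  # fsck.ext3 -fn image_replay

If step 3 comes back clean, the earlier reports were only the un-replayed journal plus the
stale superblock summary; if it still complains about inode sizes or bitmap differences,
that part is genuine damage.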



Thanks!


> Thanks!
> 
>> Thanks!
>> -Lukas
>>
>>>
>>>
>>> Thanks!
>>>
>>>> Thanks!
>>>> -Lukas
>>>>
>>>>>
>>>>> Then I continued my previous test. After 35 test runs, "fsck -n" again reported that the image fs
>>>>> had been damaged.
>>>>>
>>>>>  # fsck.ext3 -n image1
>>>>> e2fsck 1.42.3.wc1 (28-May-2012)
>>>>> Warning: skipping journal recovery because doing a read-only filesystem check.
>>>>> image1 has been mounted 36 times without being checked, check forced.
>>>>> Pass 1: Checking inodes, blocks, and sizes
>>>>> Inode 16407, i_size is 597447, should be 602112.  Fix? no
>>>>> Inode 16407, i_blocks is 1176, should be 1184.  Fix? no
>>>>> Inode 409941, i_blocks is 200208, should be 112.  Fix? no
>>>>> Pass 2: Checking directory structure
>>>>> Pass 3: Checking directory connectivity
>>>>> Pass 4: Checking reference counts
>>>>> Pass 5: Checking group summary information
>>>>> Block bitmap differences:  -1506836 -1506843 -(1506859--1506860) -(1660941--1661964) -(1661966--1671167) -(1671688--1686473)
>>>>> Fix? no
>>>>> Free blocks count wrong for group #2 (31558, counted=31556).
>>>>> Fix? no
>>>>> Free blocks count wrong for group #43 (15871, counted=15867).
>>>>> Fix? no
>>>>> Free blocks count wrong (2204041, counted=2204035).
>>>>> Fix? no
>>>>> image1: ********** WARNING: Filesystem still has errors **********
>>>>> image1: 13008/655360 files (0.3% non-contiguous), 417399/2621440 blocks
>>>>>
>>>>> I backed up the image to image_bk, then mounted the image on a dir and read all of its files with cat.
>>>>> Steps:
>>>>> # dd if=image1 of=image_bk
>>>>> # mount image1 err_dir
>>>>> # find -name '*' -exec cat > /dev/null {} \;
>>>>>
>>>>> There were no issues while reading the files, and no errors in dmesg either.
>>>>>
>>>>> *But after I umounted image1 from err_dir, the fsck result no longer showed any fs corruption.
>>>>>
>>>>> I also mounted image_bk on err_dir and umounted it directly without doing anything; the result is the same as for image1.
>>>>>
>>>>> *So, is the fs in the image, exported as a block device via the loop device, really
>>>>> damaged, or is there some other issue? Could you give me your opinion?
>>>>>
>>>>>
>>>>> Thanks.
>>>>>
>>>>>
>>>
>>>
> 



^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2014-08-07  3:09 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-07-14  9:34 testing result of loop-aio patchset on ext3 Rui Xiang
2014-07-14  9:51 ` Lukáš Czerner
2014-07-16  3:54   ` Rui Xiang
2014-07-16  3:54     ` Rui Xiang
2014-07-16  7:58     ` Lukáš Czerner
2014-07-16  9:28       ` Rui Xiang
2014-07-18  9:10         ` Lukáš Czerner
2014-07-21  2:34           ` Rui Xiang
2014-07-21  2:34             ` Rui Xiang
2014-08-07  2:42             ` Rui Xiang
2014-08-07  3:09             ` Rui Xiang
