regressions.lists.linux.dev archive mirror
 help / color / mirror / Atom feed
* Re: RO mount of ext4 filesystem causes writes
       [not found] <ZIauBR7YiV3rVAHL@glitch>
@ 2023-06-12  6:20 ` Bagas Sanjaya
  2023-06-13  4:48   ` Sean Greenslade
  2023-06-23  1:06   ` Bagas Sanjaya
  0 siblings, 2 replies; 9+ messages in thread
From: Bagas Sanjaya @ 2023-06-12  6:20 UTC (permalink / raw)
  To: Sean Greenslade, linux-ext4, Ye Bin, Theodore Ts'o
  Cc: Linux Kernel Mailing List, Linux Regressions

[-- Attachment #1: Type: text/plain, Size: 1677 bytes --]

On Sun, Jun 11, 2023 at 10:32:53PM -0700, Sean Greenslade wrote:
> Hello, folks.
> 
> I noticed a change in behavior of ext4 in recent kernels. I make use of
> several luks loopback images formatted as ext4 that I mount read-only
> most of the time. I use rsync to synchronize the backing images between
> machines. In the past, mouning the images as read-only would not touch
> the backing image contents at all, but recently this changed. Every
> mount, even ones that are RO from the start, will cause some small
> writes to the backing image and thus force rsync to scan the whole file.
> 
> I confirmed that the issue is still present on v6.4.rc6, so I performed
> a bisect and landed on the following commit:
> 
> > eee00237fa5ec8f704f7323b54e48cc34e2d9168 is the first bad commit
> > commit eee00237fa5ec8f704f7323b54e48cc34e2d9168
> > Author: Ye Bin <yebin10@huawei.com>
> > Date:   Tue Mar 7 14:17:02 2023 +0800
> > 
> >     ext4: commit super block if fs record error when journal record without error
> 
> That certainly looks like a likely cause of my issue, but I'm not
> familiar enough with the ext4 code to diagnose any further. Please let
> me know if you need any additional information, or if you would like me
> to test anything.
> 

Can you show dmesg when regression happens?

Ye: It looks like this regression is caused by your commit. Would you like
to take a look on it?

Anyway, thanks for the bug report. I'm adding it to regzbot:

#regzbot ^introduced: eee00237fa5ec8
#regzbot title: commit super block writes even in read-only filesystems

Thanks.

-- 
An old man doll... just what I always wanted! - Clara

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: RO mount of ext4 filesystem causes writes
  2023-06-12  6:20 ` RO mount of ext4 filesystem causes writes Bagas Sanjaya
@ 2023-06-13  4:48   ` Sean Greenslade
  2023-06-23  1:06   ` Bagas Sanjaya
  1 sibling, 0 replies; 9+ messages in thread
From: Sean Greenslade @ 2023-06-13  4:48 UTC (permalink / raw)
  To: Bagas Sanjaya
  Cc: linux-ext4, Ye Bin, Theodore Ts'o, Linux Kernel Mailing List,
	Linux Regressions

On Mon, Jun 12, 2023 at 01:20:47PM +0700, Bagas Sanjaya wrote:
> On Sun, Jun 11, 2023 at 10:32:53PM -0700, Sean Greenslade wrote:
> > Hello, folks.
> > 
> > I noticed a change in behavior of ext4 in recent kernels. I make use of
> > several luks loopback images formatted as ext4 that I mount read-only
> > most of the time. I use rsync to synchronize the backing images between
> > machines. In the past, mouning the images as read-only would not touch
> > the backing image contents at all, but recently this changed. Every
> > mount, even ones that are RO from the start, will cause some small
> > writes to the backing image and thus force rsync to scan the whole file.
> > 
> > I confirmed that the issue is still present on v6.4.rc6, so I performed
> > a bisect and landed on the following commit:
> > 
> > > eee00237fa5ec8f704f7323b54e48cc34e2d9168 is the first bad commit
> > > commit eee00237fa5ec8f704f7323b54e48cc34e2d9168
> > > Author: Ye Bin <yebin10@huawei.com>
> > > Date:   Tue Mar 7 14:17:02 2023 +0800
> > > 
> > >     ext4: commit super block if fs record error when journal record without error
> > 
> > That certainly looks like a likely cause of my issue, but I'm not
> > familiar enough with the ext4 code to diagnose any further. Please let
> > me know if you need any additional information, or if you would like me
> > to test anything.
> > 
> 
> Can you show dmesg when regression happens?
> 
> Ye: It looks like this regression is caused by your commit. Would you like
> to take a look on it?
> 
> Anyway, thanks for the bug report. I'm adding it to regzbot:
> 
> #regzbot ^introduced: eee00237fa5ec8
> #regzbot title: commit super block writes even in read-only filesystems

I did a fresh boot on the 6.4 rc6 kernel and did an RO mount and
unmount. Here's the full dmesg output from that:

[   48.955896] loop0: detected capacity change from 0 to 4194304
[   48.965526] Key type trusted registered
[   48.973640] Key type encrypted registered
[   49.032404] EXT4-fs (dm-0): mounted filesystem 4e824972-4523-407e-b0da-3229a71b68d8 ro with ordered data mode. Quota mode: none.
[   61.180755] EXT4-fs (dm-0): unmounting filesystem 4e824972-4523-407e-b0da-3229a71b68d8.
[   61.236958] dm-0: detected capacity change from 4190208 to 0

Thanks,

--Sean


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: RO mount of ext4 filesystem causes writes
  2023-06-12  6:20 ` RO mount of ext4 filesystem causes writes Bagas Sanjaya
  2023-06-13  4:48   ` Sean Greenslade
@ 2023-06-23  1:06   ` Bagas Sanjaya
  2023-06-23  4:46     ` Theodore Ts'o
  1 sibling, 1 reply; 9+ messages in thread
From: Bagas Sanjaya @ 2023-06-23  1:06 UTC (permalink / raw)
  To: Sean Greenslade, linux-ext4, Ye Bin, Theodore Ts'o,
	Thorsten Leemhuis
  Cc: Linux Kernel Mailing List, Linux Regressions

[-- Attachment #1: Type: text/plain, Size: 1927 bytes --]

On Mon, Jun 12, 2023 at 01:20:47PM +0700, Bagas Sanjaya wrote:
> On Sun, Jun 11, 2023 at 10:32:53PM -0700, Sean Greenslade wrote:
> > Hello, folks.
> > 
> > I noticed a change in behavior of ext4 in recent kernels. I make use of
> > several luks loopback images formatted as ext4 that I mount read-only
> > most of the time. I use rsync to synchronize the backing images between
> > machines. In the past, mouning the images as read-only would not touch
> > the backing image contents at all, but recently this changed. Every
> > mount, even ones that are RO from the start, will cause some small
> > writes to the backing image and thus force rsync to scan the whole file.
> > 
> > I confirmed that the issue is still present on v6.4.rc6, so I performed
> > a bisect and landed on the following commit:
> > 
> > > eee00237fa5ec8f704f7323b54e48cc34e2d9168 is the first bad commit
> > > commit eee00237fa5ec8f704f7323b54e48cc34e2d9168
> > > Author: Ye Bin <yebin10@huawei.com>
> > > Date:   Tue Mar 7 14:17:02 2023 +0800
> > > 
> > >     ext4: commit super block if fs record error when journal record without error
> > 
> > That certainly looks like a likely cause of my issue, but I'm not
> > familiar enough with the ext4 code to diagnose any further. Please let
> > me know if you need any additional information, or if you would like me
> > to test anything.
> > 
> 
> Can you show dmesg when regression happens?
> 
> Ye: It looks like this regression is caused by your commit. Would you like
> to take a look on it?
> 
> Anyway, thanks for the bug report. I'm adding it to regzbot:
> 
> #regzbot ^introduced: eee00237fa5ec8
> #regzbot title: commit super block writes even in read-only filesystems
> 

Hi Thorsten,

No reply so far from the culprit author (Ye Bin) nor from Ted. Can
you help in this case?

Thanks.

-- 
An old man doll... just what I always wanted! - Clara

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: RO mount of ext4 filesystem causes writes
  2023-06-23  1:06   ` Bagas Sanjaya
@ 2023-06-23  4:46     ` Theodore Ts'o
  2023-06-23  6:18       ` Sean Greenslade
  0 siblings, 1 reply; 9+ messages in thread
From: Theodore Ts'o @ 2023-06-23  4:46 UTC (permalink / raw)
  To: Bagas Sanjaya
  Cc: Sean Greenslade, linux-ext4, Ye Bin, Thorsten Leemhuis,
	Linux Kernel Mailing List, Linux Regressions

On Fri, Jun 23, 2023 at 08:06:02AM +0700, Bagas Sanjaya wrote:
> 
> No reply so far from the culprit author (Ye Bin) nor from Ted. Can
> you help in this case?

There's been no reply because I haven't been able to replicate it, and
I didn't have the time do enough work to convince myself the report
was bogus.  At this point, I have spent time trying to reproduce it,
and I've had no luck.

So, unless you can give me a simple set of reproduction instructions,
I'm going to have to treat this report is invalid.

Regards,

						- Ted

Note: this test was done using kvm-xfstests which can be found
https://github.com/tytso/xfstests-bld using the install-kconfig and
the kbuild script that can also be found in this report.  So if you
want to play along from home, feel free.  :-)


root@kvm-xfstests:~# mkfs.ext4 /dev/vdc
mke2fs 1.47.0 (5-Feb-2023)
Discarding device blocks: done                            
Creating filesystem with 1310720 4k blocks and 327680 inodes
Filesystem UUID: fe434060-6731-4b40-a94a-3a8517df0660
Superblock backups stored on blocks: 
        32768, 98304, 163840, 229376, 294912, 819200, 884736

Allocating group tables: done                            
Writing inode tables: done                            
Creating journal (16384 blocks): done
Writing superblocks and filesystem accounting information: done 

root@kvm-xfstests:~# md5sum /dev/vdc
fd38f9f8476ad63a744d179846ee7e18  /dev/vdc
root@kvm-xfstests:~# mount -o ro /dev/vdc /mnt
[  472.893614] EXT4-fs (vdc): orphan cleanup on readonly fs
[  472.894022] EXT4-fs (vdc): mounted filesystem fe434060-6731-4b40-a94a-3a8517df0660 ro with ordered data mode. Quota mode: none.
root@kvm-xfstests:~# umount /mnt
[  475.698053] EXT4-fs (vdc): unmounting filesystem fe434060-6731-4b40-a94a-3a8517df0660.
root@kvm-xfstests:~# md5sum /dev/vdc
fd38f9f8476ad63a744d179846ee7e18  /dev/vdc

Hmm.... OK, let's try it with LUKS, even though that *really*
shouldn't make a difference.  The cryptsetup lukeFormat and mkfs.ext4
steps are skipped here.  Also, note that I had to manually edit the
.config file to enable CONFIG_DM_CRYPT, since I dm_crypt is used by
xfstests, so my install-kconfig script doesn't enable CONFIG_DM_CRYPT.


root@kvm-xfstests:~# uname -a
Linux kvm-xfstests 6.4.0-rc6-xfstests-lockdep #200 SMP PREEMPT_DYNAMIC Fri Jun 23 00:33:39 EDT 2023 x86_64 GNU/Linux

root@kvm-xfstests:~# md5sum /dev/vdc
28b75cc094e1e2a62ac25a730fc1dfee  /dev/vdc
root@kvm-xfstests:~# cryptsetup luksOpen /dev/vdc test
Enter passphrase for /dev/vdc: 
root@kvm-xfstests:~# mount -o ro /dev/mapper/test /mnt
[  812.073771] EXT4-fs (dm-0): orphan cleanup on readonly fs
[  812.074306] EXT4-fs (dm-0): mounted filesystem ac3f76f1-da0a-426e-85b2-08526afb2224 ro with ordered data mode. Quota mode: none.
root@kvm-xfstests:~# umount /mnt
[  814.383016] EXT4-fs (dm-0): unmounting filesystem ac3f76f1-da0a-426e-85b2-08526afb2224.
root@kvm-xfstests:~# cryptsetup luksClose /dev/mapper/test
[  830.001992] dm-0: detected capacity change from 10452992 to 0
root@kvm-xfstests:~# md5sum /dev/vdc
28b75cc094e1e2a62ac25a730fc1dfee  /dev/vdc



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: RO mount of ext4 filesystem causes writes
  2023-06-23  4:46     ` Theodore Ts'o
@ 2023-06-23  6:18       ` Sean Greenslade
  2023-06-23 14:34         ` Theodore Ts'o
  0 siblings, 1 reply; 9+ messages in thread
From: Sean Greenslade @ 2023-06-23  6:18 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Bagas Sanjaya, linux-ext4, Ye Bin, Thorsten Leemhuis,
	Linux Kernel Mailing List, Linux Regressions

On Fri, Jun 23, 2023 at 12:46:11AM -0400, Theodore Ts'o wrote:
> On Fri, Jun 23, 2023 at 08:06:02AM +0700, Bagas Sanjaya wrote:
> > 
> > No reply so far from the culprit author (Ye Bin) nor from Ted. Can
> > you help in this case?
> 
> There's been no reply because I haven't been able to replicate it, and
> I didn't have the time do enough work to convince myself the report
> was bogus.  At this point, I have spent time trying to reproduce it,
> and I've had no luck.
> 
> So, unless you can give me a simple set of reproduction instructions,
> I'm going to have to treat this report is invalid.
> 
> Regards,
> 
> 						- Ted
> 
> Note: this test was done using kvm-xfstests which can be found
> https://github.com/tytso/xfstests-bld using the install-kconfig and
> the kbuild script that can also be found in this report.  So if you
> want to play along from home, feel free.  :-)
> 
> 
> root@kvm-xfstests:~# mkfs.ext4 /dev/vdc
> mke2fs 1.47.0 (5-Feb-2023)
> Discarding device blocks: done                            
> Creating filesystem with 1310720 4k blocks and 327680 inodes
> Filesystem UUID: fe434060-6731-4b40-a94a-3a8517df0660
> Superblock backups stored on blocks: 
>         32768, 98304, 163840, 229376, 294912, 819200, 884736
> 
> Allocating group tables: done                            
> Writing inode tables: done                            
> Creating journal (16384 blocks): done
> Writing superblocks and filesystem accounting information: done 
> 
> root@kvm-xfstests:~# md5sum /dev/vdc
> fd38f9f8476ad63a744d179846ee7e18  /dev/vdc
> root@kvm-xfstests:~# mount -o ro /dev/vdc /mnt
> [  472.893614] EXT4-fs (vdc): orphan cleanup on readonly fs
> [  472.894022] EXT4-fs (vdc): mounted filesystem fe434060-6731-4b40-a94a-3a8517df0660 ro with ordered data mode. Quota mode: none.
> root@kvm-xfstests:~# umount /mnt
> [  475.698053] EXT4-fs (vdc): unmounting filesystem fe434060-6731-4b40-a94a-3a8517df0660.
> root@kvm-xfstests:~# md5sum /dev/vdc
> fd38f9f8476ad63a744d179846ee7e18  /dev/vdc
> 
> Hmm.... OK, let's try it with LUKS, even though that *really*
> shouldn't make a difference.  The cryptsetup lukeFormat and mkfs.ext4
> steps are skipped here.  Also, note that I had to manually edit the
> .config file to enable CONFIG_DM_CRYPT, since I dm_crypt is used by
> xfstests, so my install-kconfig script doesn't enable CONFIG_DM_CRYPT.
> 
> 
> root@kvm-xfstests:~# uname -a
> Linux kvm-xfstests 6.4.0-rc6-xfstests-lockdep #200 SMP PREEMPT_DYNAMIC Fri Jun 23 00:33:39 EDT 2023 x86_64 GNU/Linux
> 
> root@kvm-xfstests:~# md5sum /dev/vdc
> 28b75cc094e1e2a62ac25a730fc1dfee  /dev/vdc
> root@kvm-xfstests:~# cryptsetup luksOpen /dev/vdc test
> Enter passphrase for /dev/vdc: 
> root@kvm-xfstests:~# mount -o ro /dev/mapper/test /mnt
> [  812.073771] EXT4-fs (dm-0): orphan cleanup on readonly fs
> [  812.074306] EXT4-fs (dm-0): mounted filesystem ac3f76f1-da0a-426e-85b2-08526afb2224 ro with ordered data mode. Quota mode: none.
> root@kvm-xfstests:~# umount /mnt
> [  814.383016] EXT4-fs (dm-0): unmounting filesystem ac3f76f1-da0a-426e-85b2-08526afb2224.
> root@kvm-xfstests:~# cryptsetup luksClose /dev/mapper/test
> [  830.001992] dm-0: detected capacity change from 10452992 to 0
> root@kvm-xfstests:~# md5sum /dev/vdc
> 28b75cc094e1e2a62ac25a730fc1dfee  /dev/vdc

Hi, Ted. Thanks for taking the time to look into this.

I perhaps should have been more explicit in my report. The issue is not
that the image is any different after the mount; indeed, the md5sums are
identical before and after on my machine as well. The issue is that
something is issuing writes to the backing image, which bumps the mtime
of the backing image. When handling the images with rsync, a difference
in mtime causes the whole image to need to be read.

See this flow:

$ stat image.block | grep Modify
Modify: 2023-06-22 23:00:15.211379972 -0700

# cryptsetup luksOpen image.block t2 && mount -o ro /dev/mapper/t2 /mnt/t2
# umount /mnt/t2 && cryptsetup luksClose t2

$ stat image.block | grep Modify
Modify: 2023-06-22 23:03:56.141139649 -0700


The regression is the mtime bumping behavior, which doesn't happen prior
to the eee00237 commit.

Thanks,

--Sean


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: RO mount of ext4 filesystem causes writes
  2023-06-23  6:18       ` Sean Greenslade
@ 2023-06-23 14:34         ` Theodore Ts'o
  2023-06-23 15:38           ` Ritesh Harjani
  2023-06-23 16:53           ` Sean Greenslade
  0 siblings, 2 replies; 9+ messages in thread
From: Theodore Ts'o @ 2023-06-23 14:34 UTC (permalink / raw)
  To: Sean Greenslade
  Cc: Bagas Sanjaya, linux-ext4, Ye Bin, Thorsten Leemhuis,
	Linux Kernel Mailing List, Linux Regressions

On Thu, Jun 22, 2023 at 11:18:04PM -0700, Sean Greenslade wrote:
> I perhaps should have been more explicit in my report. The issue is not
> that the image is any different after the mount; indeed, the md5sums are
> identical before and after on my machine as well. The issue is that
> something is issuing writes to the backing image, which bumps the mtime
> of the backing image. When handling the images with rsync, a difference
> in mtime causes the whole image to need to be read.

Ah, yes, your initial report said "small writes", but it didn't
specify whether the issue was that writes were modifying the image, or
just simply touching the mtime field of the backing file.  I assume
these must be largish fs images, since it must have made the increased
rsync time noticeable?

This appears to fix the problem for me, given the clarified
reproduction information.  Could you please try it on your end?

	     		   	     	    - Ted

From 6bb438fa0aac4c08acd626d408cb6d4b745df7fd Mon Sep 17 00:00:00 2001
From: Theodore Ts'o <tytso@mit.edu>
Date: Fri, 23 Jun 2023 10:18:51 -0400
Subject: [PATCH] ext4: avoid updating the superblock on a r/o mount if not
 needed

This was noticed by a user who noticied that the mtime of a file
backing a loopback device was getting bumped when the loopback device
is mounted read/only.  Note: This doesn't show up when doing a
loopback mount of a file directly, via "mount -o ro /tmp/foo.img
/mnt", since the loop device is set read-only when mount automatically
creates loop device.  However, this is noticeable for a LUKS loop
device like this:

% cryptsetup luksOpen /tmp/foo.img test
% mount -o ro /dev/loop0 /mnt ; umount /mnt

or, if LUKS is not in use, if the user manually creates the loop
device like this:

% losetup /dev/loop0 /tmp/foo.img
% mount -o ro /dev/loop0 /mnt ; umount /mnt

The modified mtime causes rsync to do a rolling checksum scan of the
file on the local and remote side, incrementally increasing the time
to rsync the not-modified-but-touched image file.

Fixes: eee00237fa5e ("ext4: commit super block if fs record error when journal record without error")
Cc: stable@kernel.org
Link: https://lore.kernel.org/r/ZIauBR7YiV3rVAHL@glitch
Reported-by: Sean Greenslade <sean@seangreenslade.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
---
 fs/ext4/super.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index b3819e70093e..c638b0db3b2b 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -5997,19 +5997,27 @@ static int ext4_load_journal(struct super_block *sb,
 		err = jbd2_journal_wipe(journal, !really_read_only);
 	if (!err) {
 		char *save = kmalloc(EXT4_S_ERR_LEN, GFP_KERNEL);
+		__le16 orig_state;
+		bool changed = false;
 
 		if (save)
 			memcpy(save, ((char *) es) +
 			       EXT4_S_ERR_START, EXT4_S_ERR_LEN);
 		err = jbd2_journal_load(journal);
-		if (save)
+		if (save && memcmp(((char *) es) + EXT4_S_ERR_START,
+				   save, EXT4_S_ERR_LEN)) {
 			memcpy(((char *) es) + EXT4_S_ERR_START,
 			       save, EXT4_S_ERR_LEN);
+			changed = true;
+		}
 		kfree(save);
+		orig_state = es->s_state;
 		es->s_state |= cpu_to_le16(EXT4_SB(sb)->s_mount_state &
 					   EXT4_ERROR_FS);
+		if (orig_state != es->s_state)
+			changed = true;
 		/* Write out restored error information to the superblock */
-		if (!bdev_read_only(sb->s_bdev)) {
+		if (changed && !really_read_only) {
 			int err2;
 			err2 = ext4_commit_super(sb);
 			err = err ? : err2;
-- 
2.31.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: RO mount of ext4 filesystem causes writes
  2023-06-23 14:34         ` Theodore Ts'o
@ 2023-06-23 15:38           ` Ritesh Harjani
  2023-06-24 19:39             ` Theodore Ts'o
  2023-06-23 16:53           ` Sean Greenslade
  1 sibling, 1 reply; 9+ messages in thread
From: Ritesh Harjani @ 2023-06-23 15:38 UTC (permalink / raw)
  To: Theodore Ts'o, Sean Greenslade
  Cc: Bagas Sanjaya, linux-ext4, Ye Bin, Thorsten Leemhuis,
	Linux Kernel Mailing List, Linux Regressions

"Theodore Ts'o" <tytso@mit.edu> writes:

> On Thu, Jun 22, 2023 at 11:18:04PM -0700, Sean Greenslade wrote:
>> I perhaps should have been more explicit in my report. The issue is not
>> that the image is any different after the mount; indeed, the md5sums are
>> identical before and after on my machine as well. The issue is that
>> something is issuing writes to the backing image, which bumps the mtime
>> of the backing image. When handling the images with rsync, a difference
>> in mtime causes the whole image to need to be read.
>
> Ah, yes, your initial report said "small writes", but it didn't
> specify whether the issue was that writes were modifying the image, or
> just simply touching the mtime field of the backing file.  I assume
> these must be largish fs images, since it must have made the increased
> rsync time noticeable?
>
> This appears to fix the problem for me, given the clarified
> reproduction information.  Could you please try it on your end?
>
> 	     		   	     	    - Ted
>
> From 6bb438fa0aac4c08acd626d408cb6d4b745df7fd Mon Sep 17 00:00:00 2001
> From: Theodore Ts'o <tytso@mit.edu>
> Date: Fri, 23 Jun 2023 10:18:51 -0400
> Subject: [PATCH] ext4: avoid updating the superblock on a r/o mount if not
>  needed
>
> This was noticed by a user who noticied that the mtime of a file
> backing a loopback device was getting bumped when the loopback device
> is mounted read/only.  Note: This doesn't show up when doing a
> loopback mount of a file directly, via "mount -o ro /tmp/foo.img
> /mnt", since the loop device is set read-only when mount automatically
> creates loop device.  However, this is noticeable for a LUKS loop
> device like this:
>
> % cryptsetup luksOpen /tmp/foo.img test
> % mount -o ro /dev/loop0 /mnt ; umount /mnt
>
> or, if LUKS is not in use, if the user manually creates the loop
> device like this:
>
> % losetup /dev/loop0 /tmp/foo.img
> % mount -o ro /dev/loop0 /mnt ; umount /mnt
>
> The modified mtime causes rsync to do a rolling checksum scan of the
> file on the local and remote side, incrementally increasing the time
> to rsync the not-modified-but-touched image file.
>
> Fixes: eee00237fa5e ("ext4: commit super block if fs record error when journal record without error")
> Cc: stable@kernel.org
> Link: https://lore.kernel.org/r/ZIauBR7YiV3rVAHL@glitch
> Reported-by: Sean Greenslade <sean@seangreenslade.com>
> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
> ---
>  fs/ext4/super.c | 12 ++++++++++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index b3819e70093e..c638b0db3b2b 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -5997,19 +5997,27 @@ static int ext4_load_journal(struct super_block *sb,
>  		err = jbd2_journal_wipe(journal, !really_read_only);
>  	if (!err) {
>  		char *save = kmalloc(EXT4_S_ERR_LEN, GFP_KERNEL);
> +		__le16 orig_state;
> +		bool changed = false;
>  
>  		if (save)
>  			memcpy(save, ((char *) es) +
>  			       EXT4_S_ERR_START, EXT4_S_ERR_LEN);
>  		err = jbd2_journal_load(journal);
> -		if (save)
> +		if (save && memcmp(((char *) es) + EXT4_S_ERR_START,
> +				   save, EXT4_S_ERR_LEN)) {
>  			memcpy(((char *) es) + EXT4_S_ERR_START,
>  			       save, EXT4_S_ERR_LEN);
> +			changed = true;
> +		}

It seems in the original code what we were trying to do was to preseve
the error information area of superblock across journal load (which I am
not sure why though?)

In the new code we see if the journal load changed that area and if yes
we change that back to original log but we also marked changed = true. Why?

>  		kfree(save);
> +		orig_state = es->s_state;
>  		es->s_state |= cpu_to_le16(EXT4_SB(sb)->s_mount_state &
>  					   EXT4_ERROR_FS);
> +		if (orig_state != es->s_state)
> +			changed = true;
>  		/* Write out restored error information to the superblock */
> -		if (!bdev_read_only(sb->s_bdev)) {
> +		if (changed && !really_read_only) {
>  			int err2;
>  			err2 = ext4_commit_super(sb);
>  			err = err ? : err2;

Yes, this make sense. Earlier we were always doing ext4_commit_super()
even if es->s_state hasn't changed. But this code we only do
ext4_commit_super when there is a es->s_state change from orig_state.

-ritesh

> -- 
> 2.31.0

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: RO mount of ext4 filesystem causes writes
  2023-06-23 14:34         ` Theodore Ts'o
  2023-06-23 15:38           ` Ritesh Harjani
@ 2023-06-23 16:53           ` Sean Greenslade
  1 sibling, 0 replies; 9+ messages in thread
From: Sean Greenslade @ 2023-06-23 16:53 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Bagas Sanjaya, linux-ext4, Ye Bin, Thorsten Leemhuis,
	Linux Kernel Mailing List, Linux Regressions

On Fri, Jun 23, 2023 at 10:34:11AM -0400, Theodore Ts'o wrote:
> Ah, yes, your initial report said "small writes", but it didn't
> specify whether the issue was that writes were modifying the image, or
> just simply touching the mtime field of the backing file.  I assume
> these must be largish fs images, since it must have made the increased
> rsync time noticeable?

Yeah, these are ~30 GB images on spinning drives, so it's the difference
between near-instant and several minutes.

> This appears to fix the problem for me, given the clarified
> reproduction information.  Could you please try it on your end?
> 
> 	     		   	     	    - Ted

Yup, this patch does indeed solve the problem I'm seeing. Thanks!

--Sean

> From 6bb438fa0aac4c08acd626d408cb6d4b745df7fd Mon Sep 17 00:00:00 2001
> From: Theodore Ts'o <tytso@mit.edu>
> Date: Fri, 23 Jun 2023 10:18:51 -0400
> Subject: [PATCH] ext4: avoid updating the superblock on a r/o mount if not
>  needed
> 
> This was noticed by a user who noticied that the mtime of a file
> backing a loopback device was getting bumped when the loopback device
> is mounted read/only.  Note: This doesn't show up when doing a
> loopback mount of a file directly, via "mount -o ro /tmp/foo.img
> /mnt", since the loop device is set read-only when mount automatically
> creates loop device.  However, this is noticeable for a LUKS loop
> device like this:
> 
> % cryptsetup luksOpen /tmp/foo.img test
> % mount -o ro /dev/loop0 /mnt ; umount /mnt
> 
> or, if LUKS is not in use, if the user manually creates the loop
> device like this:
> 
> % losetup /dev/loop0 /tmp/foo.img
> % mount -o ro /dev/loop0 /mnt ; umount /mnt
> 
> The modified mtime causes rsync to do a rolling checksum scan of the
> file on the local and remote side, incrementally increasing the time
> to rsync the not-modified-but-touched image file.
> 
> Fixes: eee00237fa5e ("ext4: commit super block if fs record error when journal record without error")
> Cc: stable@kernel.org
> Link: https://lore.kernel.org/r/ZIauBR7YiV3rVAHL@glitch
> Reported-by: Sean Greenslade <sean@seangreenslade.com>
> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
> ---
>  fs/ext4/super.c | 12 ++++++++++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index b3819e70093e..c638b0db3b2b 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -5997,19 +5997,27 @@ static int ext4_load_journal(struct super_block *sb,
>  		err = jbd2_journal_wipe(journal, !really_read_only);
>  	if (!err) {
>  		char *save = kmalloc(EXT4_S_ERR_LEN, GFP_KERNEL);
> +		__le16 orig_state;
> +		bool changed = false;
>  
>  		if (save)
>  			memcpy(save, ((char *) es) +
>  			       EXT4_S_ERR_START, EXT4_S_ERR_LEN);
>  		err = jbd2_journal_load(journal);
> -		if (save)
> +		if (save && memcmp(((char *) es) + EXT4_S_ERR_START,
> +				   save, EXT4_S_ERR_LEN)) {
>  			memcpy(((char *) es) + EXT4_S_ERR_START,
>  			       save, EXT4_S_ERR_LEN);
> +			changed = true;
> +		}
>  		kfree(save);
> +		orig_state = es->s_state;
>  		es->s_state |= cpu_to_le16(EXT4_SB(sb)->s_mount_state &
>  					   EXT4_ERROR_FS);
> +		if (orig_state != es->s_state)
> +			changed = true;
>  		/* Write out restored error information to the superblock */
> -		if (!bdev_read_only(sb->s_bdev)) {
> +		if (changed && !really_read_only) {
>  			int err2;
>  			err2 = ext4_commit_super(sb);
>  			err = err ? : err2;
> -- 
> 2.31.0

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: RO mount of ext4 filesystem causes writes
  2023-06-23 15:38           ` Ritesh Harjani
@ 2023-06-24 19:39             ` Theodore Ts'o
  0 siblings, 0 replies; 9+ messages in thread
From: Theodore Ts'o @ 2023-06-24 19:39 UTC (permalink / raw)
  To: Ritesh Harjani
  Cc: Sean Greenslade, Bagas Sanjaya, linux-ext4, Ye Bin,
	Thorsten Leemhuis, Linux Kernel Mailing List, Linux Regressions

On Fri, Jun 23, 2023 at 09:08:37PM +0530, Ritesh Harjani wrote:
> It seems in the original code what we were trying to do was to preseve
> the error information area of superblock across journal load (which I am
> not sure why though?)
> 
> In the new code we see if the journal load changed that area and if yes
> we change that back to original log but we also marked changed = true. Why?

That's a good question; thanks for asking it.  The first part of this
code was introduced by commit 1c13d5c08728 ("ext4: Save error
information to the superblock for analysis") in 2010.  So that part of
the code is not "new", but very, very, old. 

The basic idea here was that back then, when a file system error was
detected, it was always written directly to the superblock, by passing
the journal.  So that's why the original code saved the error
information, replayed the journal and then restored it.

Of course, this changed with commit 2d01ddc86606 ("ext4: save error
info to sb through journal if available") in 2020.  But the problem is
"if available".  If the jbd2 layer has shut down, then we can't route
the superblock error updates through the journal, at which point ext4
will do a direct update of the superlbock.

This was the rational behind commit eee00237fa5e ("ext4: commit super
block if fs record error when journal record without error").
Sometimes the error bit EXT4_ERROR_FS is set via a direct write to the
superblock; but other times the error bit might be set via the
journal.  In the former case, after we do a journal replay, the error
bit will be cleared.  However, since the kernel never clears the
EXT4_ERROR_FS bit, it's pretty easy for commit eee00237fa5e to handle
things.

So what we need to for that first part of the code, introduced in
commit 1c13d5c08728 and made invalid in commit 2d01ddc86606 is we need
to add code to examine s_last_error_time.  If the version that was
originally in the superblock is newer than the version found after the
journal replay, then presumably an error happened but ext4 wasn't able
to write the error information out through the journal, and we need to
replace it after the the call to jbd2_journal_load().

Cheers,

						- Ted

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2023-06-24 19:39 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <ZIauBR7YiV3rVAHL@glitch>
2023-06-12  6:20 ` RO mount of ext4 filesystem causes writes Bagas Sanjaya
2023-06-13  4:48   ` Sean Greenslade
2023-06-23  1:06   ` Bagas Sanjaya
2023-06-23  4:46     ` Theodore Ts'o
2023-06-23  6:18       ` Sean Greenslade
2023-06-23 14:34         ` Theodore Ts'o
2023-06-23 15:38           ` Ritesh Harjani
2023-06-24 19:39             ` Theodore Ts'o
2023-06-23 16:53           ` Sean Greenslade

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).