linux-kernel.vger.kernel.org archive mirror
* Inode 2885482 (000000008e814f64): i_reserved_data_blocks (2) not cleared!
@ 2021-10-14 12:54 Rantala, Tommi T. (Nokia - FI/Espoo)
  2021-10-14 18:06 ` Gao Xiang
  0 siblings, 1 reply; 5+ messages in thread
From: Rantala, Tommi T. (Nokia - FI/Espoo) @ 2021-10-14 12:54 UTC (permalink / raw)
  To: jefflexu, enwlinux, hsiangkao, tytso; +Cc: linux-ext4, linux-kernel

Hi,

I'm seeing these i_reserved_data_blocks not cleared! messages when using ext4
with nodelalloc, message added in:

  commit 6fed83957f21eff11c8496e9f24253b03d2bc1dc
  Author: Jeffle Xu <jefflexu@linux.alibaba.com>
  Date:   Mon Aug 23 14:13:58 2021 +0800

      ext4: fix reserved space counter leakage

I can quickly reproduce in 5.15.0-rc5-00041-g348949d9a444 by doing some
filesystem I/O while toggling delalloc:


while true; do mount -o remount,nodelalloc /; sleep 1; mount -o remount,delalloc /; sleep 1; done &
git clone linux xxx; rm -rf xxx

[  222.928341] EXT4-fs (vdb1): re-mounted. Opts: delalloc. Quota mode: disabled.
[  223.932516] EXT4-fs (vdb1): re-mounted. Opts: nodelalloc. Quota mode: disabled.
[  224.183741] EXT4-fs (vdb1): Inode 2885482 (000000008e814f64): i_reserved_data_blocks (2) not cleared!
[  224.185064] EXT4-fs (vdb1): Inode 2885478 (00000000862b48ad): i_reserved_data_blocks (2) not cleared!
[  224.186434] EXT4-fs (vdb1): Inode 2885474 (00000000a20bdd95): i_reserved_data_blocks (7) not cleared!
[  224.187649] EXT4-fs (vdb1): Inode 2885476 (00000000028005e1): i_reserved_data_blocks (2) not cleared!
[  224.189016] EXT4-fs (vdb1): Inode 2885475 (0000000025d9617d): i_reserved_data_blocks (2) not cleared!
[  224.190370] EXT4-fs (vdb1): Inode 2885480 (00000000d0722d90): i_reserved_data_blocks (7) not cleared!
[  224.191732] EXT4-fs (vdb1): Inode 2885481 (000000009b50d6cb): i_reserved_data_blocks (1) not cleared!
[  224.193093] EXT4-fs (vdb1): Inode 2885472 (00000000fe907f54): i_reserved_data_blocks (1) not cleared!
[  227.946984] EXT4-fs: 9213 callbacks suppressed
[  227.946989] EXT4-fs (vdb1): re-mounted. Opts: nodelalloc. Quota mode: disabled.


-Tommi



* Re: Inode 2885482 (000000008e814f64): i_reserved_data_blocks (2) not cleared!
  2021-10-14 12:54 Inode 2885482 (000000008e814f64): i_reserved_data_blocks (2) not cleared! Rantala, Tommi T. (Nokia - FI/Espoo)
@ 2021-10-14 18:06 ` Gao Xiang
  2021-10-14 21:57   ` Theodore Ts'o
  0 siblings, 1 reply; 5+ messages in thread
From: Gao Xiang @ 2021-10-14 18:06 UTC (permalink / raw)
  To: Rantala, Tommi T. (Nokia - FI/Espoo)
  Cc: jefflexu, enwlinux, tytso, linux-ext4, linux-kernel

On Thu, Oct 14, 2021 at 12:54:14PM +0000, Rantala, Tommi T. (Nokia - FI/Espoo) wrote:
> Hi,
> 
> I'm seeing these i_reserved_data_blocks not cleared! messages when using ext4
> with nodelalloc, message added in:
> 
>   commit 6fed83957f21eff11c8496e9f24253b03d2bc1dc
>   Author: Jeffle Xu <jefflexu@linux.alibaba.com>
>   Date:   Mon Aug 23 14:13:58 2021 +0800
> 
>       ext4: fix reserved space counter leakage
> 
> I can quickly reproduce in 5.15.0-rc5-00041-g348949d9a444 by doing some
> filesystem I/O while toggling delalloc:
> 
> 
> while true; do mount -o remount,nodelalloc /; sleep 1; mount -o remount,delalloc /; sleep 1; done &
> git clone linux xxx; rm -rf xxx

If I understand correctly, switching this option should imply syncing
inodes to write back any existing delayed-allocation blocks.

At a glance I can't find where that happens. I haven't actually tested
it, though.

Thanks,
Gao Xiang

> 
> [  222.928341] EXT4-fs (vdb1): re-mounted. Opts: delalloc. Quota mode: disabled.
> [  223.932516] EXT4-fs (vdb1): re-mounted. Opts: nodelalloc. Quota mode: disabled.
> [  224.183741] EXT4-fs (vdb1): Inode 2885482 (000000008e814f64): i_reserved_data_blocks (2) not cleared!
> [  224.185064] EXT4-fs (vdb1): Inode 2885478 (00000000862b48ad): i_reserved_data_blocks (2) not cleared!
> [  224.186434] EXT4-fs (vdb1): Inode 2885474 (00000000a20bdd95): i_reserved_data_blocks (7) not cleared!
> [  224.187649] EXT4-fs (vdb1): Inode 2885476 (00000000028005e1): i_reserved_data_blocks (2) not cleared!
> [  224.189016] EXT4-fs (vdb1): Inode 2885475 (0000000025d9617d): i_reserved_data_blocks (2) not cleared!
> [  224.190370] EXT4-fs (vdb1): Inode 2885480 (00000000d0722d90): i_reserved_data_blocks (7) not cleared!
> [  224.191732] EXT4-fs (vdb1): Inode 2885481 (000000009b50d6cb): i_reserved_data_blocks (1) not cleared!
> [  224.193093] EXT4-fs (vdb1): Inode 2885472 (00000000fe907f54): i_reserved_data_blocks (1) not cleared!
> [  227.946984] EXT4-fs: 9213 callbacks suppressed
> [  227.946989] EXT4-fs (vdb1): re-mounted. Opts: nodelalloc. Quota mode: disabled.
> 
> 
> -Tommi
> 


* Re: Inode 2885482 (000000008e814f64): i_reserved_data_blocks (2) not cleared!
  2021-10-14 18:06 ` Gao Xiang
@ 2021-10-14 21:57   ` Theodore Ts'o
  2021-10-15  2:49     ` Gao Xiang
  2021-10-18  3:54     ` JeffleXu
  0 siblings, 2 replies; 5+ messages in thread
From: Theodore Ts'o @ 2021-10-14 21:57 UTC (permalink / raw)
  To: Gao Xiang
  Cc: Rantala, Tommi T. (Nokia - FI/Espoo),
	jefflexu, enwlinux, linux-ext4, linux-kernel

On Fri, Oct 15, 2021 at 02:06:52AM +0800, Gao Xiang wrote:
> On Thu, Oct 14, 2021 at 12:54:14PM +0000, Rantala, Tommi T. (Nokia - FI/Espoo) wrote:
> > Hi,
> > 
> > I'm seeing these i_reserved_data_blocks not cleared! messages when using ext4
> > with nodelalloc, message added in:
> > 
> >   commit 6fed83957f21eff11c8496e9f24253b03d2bc1dc
> >   Author: Jeffle Xu <jefflexu@linux.alibaba.com>
> >   Date:   Mon Aug 23 14:13:58 2021 +0800
> > 
> >       ext4: fix reserved space counter leakage
> > 
> > I can quickly reproduce in 5.15.0-rc5-00041-g348949d9a444 by doing some
> > filesystem I/O while toggling delalloc:
> > 
> > 
> > while true; do mount -o remount,nodelalloc /; sleep 1; mount -o remount,delalloc /; sleep 1; done &
> > git clone linux xxx; rm -rf xxx
> 
> If I understand correctly, switching this option should imply syncing
> inodes to write back any existing delayed-allocation blocks.

Well, no.  What it implies is that all writes after the remount into
an unallocated portion of the file will be allocated at the time when
the page is dirtied, instead of when the page is written back.  It's
possible for some pages to be written using delayed allocation, and
some other pages in the legacy "allocate on page dirty" mechanism.
This can happen when the file system is remounted; it can also happen
when the file system starts getting close to 100% full.  See the
comment in ext4_nonda_switch:

	/*
	 * switch to non delalloc mode if we are running low
	 * on free block. The free block accounting via percpu
	 * counters can get slightly wrong with percpu_counter_batch getting
	 * accumulated on each CPU without updating global counters
	 * Delalloc need an accurate free block accounting. So switch
	 * to non delalloc when we are near to error range.
	 */
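
As a rough user-space illustration of the per-write decision described
above: each write into an unallocated region picks between "allocate on
page dirty" and delayed allocation, based on the current mount option and
on how close the filesystem is to full. This is only a sketch. The struct,
the function names, the watermark field, and the exact thresholds are
hypothetical and are not taken from the kernel source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct fs_model {
	bool    delalloc_opt;    /* mount option as of the last (re)mount */
	int64_t free_clusters;   /* approximate free space */
	int64_t dirty_clusters;  /* reserved by delalloc, not yet allocated */
	int64_t watermark;       /* assumed margin for counter inaccuracy */
};

enum alloc_mode {
	ALLOC_ON_DIRTY,      /* legacy path: allocate when the page is dirtied */
	ALLOC_ON_WRITEBACK   /* delalloc path: allocate when the page is written back */
};

/*
 * Returns true when the model is "near the error range": free space is
 * below 150% of the reserved space, or below reserved space plus the
 * watermark. Both thresholds are illustrative assumptions.
 */
static bool model_nonda_switch(const struct fs_model *fs)
{
	assert(fs != NULL);
	return 2 * fs->free_clusters < 3 * fs->dirty_clusters ||
	       fs->free_clusters < fs->dirty_clusters + fs->watermark;
}

/* Decide, for one write, which allocation mechanism would be used. */
static enum alloc_mode model_pick_mode(const struct fs_model *fs)
{
	if (!fs->delalloc_opt || model_nonda_switch(fs))
		return ALLOC_ON_DIRTY;
	return ALLOC_ON_WRITEBACK;
}
```

Because the decision is made per write, a single file can end up with
some pages handled by each mechanism, which is the mixed state described
above.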

Cheers,

					- Ted


* Re: Inode 2885482 (000000008e814f64): i_reserved_data_blocks (2) not cleared!
  2021-10-14 21:57   ` Theodore Ts'o
@ 2021-10-15  2:49     ` Gao Xiang
  2021-10-18  3:54     ` JeffleXu
  1 sibling, 0 replies; 5+ messages in thread
From: Gao Xiang @ 2021-10-15  2:49 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Rantala, Tommi T. (Nokia - FI/Espoo),
	jefflexu, enwlinux, linux-ext4, linux-kernel

On Thu, Oct 14, 2021 at 05:57:32PM -0400, Theodore Ts'o wrote:
> On Fri, Oct 15, 2021 at 02:06:52AM +0800, Gao Xiang wrote:
> > On Thu, Oct 14, 2021 at 12:54:14PM +0000, Rantala, Tommi T. (Nokia - FI/Espoo) wrote:
> > > Hi,
> > > 
> > > I'm seeing these i_reserved_data_blocks not cleared! messages when using ext4
> > > with nodelalloc, message added in:
> > > 
> > >   commit 6fed83957f21eff11c8496e9f24253b03d2bc1dc
> > >   Author: Jeffle Xu <jefflexu@linux.alibaba.com>
> > >   Date:   Mon Aug 23 14:13:58 2021 +0800
> > > 
> > >       ext4: fix reserved space counter leakage
> > > 
> > > I can quickly reproduce in 5.15.0-rc5-00041-g348949d9a444 by doing some
> > > filesystem I/O while toggling delalloc:
> > > 
> > > 
> > > while true; do mount -o remount,nodelalloc /; sleep 1; mount -o remount,delalloc /; sleep 1; done &
> > > git clone linux xxx; rm -rf xxx
> > 
> > If I understand correctly, switching this option should imply syncing
> > inodes to write back any existing delayed-allocation blocks.
> 
> Well, no.  What it implies is that all writes after the remount into
> an unallocated portion of the file will be allocated at the time when
> the page is dirtied, instead of when the page is written back.  It's
> possible for some pages to be written using delayed allocation, and
> some other pages in the legacy "allocate on page dirty" mechanism.
> This can happen when the file system is remounted; it can also happen
> when the file system starts getting close to 100% full.  See the
> comment in ext4_nonda_switch:
> 
> 	/*
> 	 * switch to non delalloc mode if we are running low
> 	 * on free block. The free block accounting via percpu
> 	 * counters can get slightly wrong with percpu_counter_batch getting
> 	 * accumulated on each CPU without updating global counters
> 	 * Delalloc need an accurate free block accounting. So switch
> 	 * to non delalloc when we are near to error range.
> 	 */

Hi Ted,

OK, thanks for the detailed explanation of the behavior. In that case,
I guess several checks of "test_opt(inode->i_sb, DELALLOC)" could be
somewhat racy? For example, the check in __es_remove_extent() in
extents_status.c?

Thanks,
Gao Xiang

> 
> Cheers,
> 
> 					- Ted


* Re: Inode 2885482 (000000008e814f64): i_reserved_data_blocks (2) not cleared!
  2021-10-14 21:57   ` Theodore Ts'o
  2021-10-15  2:49     ` Gao Xiang
@ 2021-10-18  3:54     ` JeffleXu
  1 sibling, 0 replies; 5+ messages in thread
From: JeffleXu @ 2021-10-18  3:54 UTC (permalink / raw)
  To: Theodore Ts'o, Gao Xiang
  Cc: Rantala, Tommi T. (Nokia - FI/Espoo), enwlinux, linux-ext4, linux-kernel



On 10/15/21 5:57 AM, Theodore Ts'o wrote:
> On Fri, Oct 15, 2021 at 02:06:52AM +0800, Gao Xiang wrote:
>> On Thu, Oct 14, 2021 at 12:54:14PM +0000, Rantala, Tommi T. (Nokia - FI/Espoo) wrote:
>>> Hi,
>>>
>>> I'm seeing these i_reserved_data_blocks not cleared! messages when using ext4
>>> with nodelalloc, message added in:
>>>
>>>   commit 6fed83957f21eff11c8496e9f24253b03d2bc1dc
>>>   Author: Jeffle Xu <jefflexu@linux.alibaba.com>
>>>   Date:   Mon Aug 23 14:13:58 2021 +0800
>>>
>>>       ext4: fix reserved space counter leakage
>>>
>>> I can quickly reproduce in 5.15.0-rc5-00041-g348949d9a444 by doing some
>>> filesystem I/O while toggling delalloc:
>>>
>>>
>>> while true; do mount -o remount,nodelalloc /; sleep 1; mount -o remount,delalloc /; sleep 1; done &
>>> git clone linux xxx; rm -rf xxx
>>
>> If I understand correctly, switching this option should imply syncing
>> inodes to write back any existing delayed-allocation blocks.
> 
> Well, no.  What it implies is that all writes after the remount into
> an unallocated portion of the file will be allocated at the time when
> the page is dirtied, instead of when the page is written back.  It's
> possible for some pages to be written using delayed allocation, and
> some other pages in the legacy "allocate on page dirty" mechanism.
> This can happen when the file system is remounted; it can also happen
> when the file system starts getting close to 100% full.  See the
> comment in ext4_nonda_switch:
> 
> 	/*
> 	 * switch to non delalloc mode if we are running low
> 	 * on free block. The free block accounting via percpu
> 	 * counters can get slightly wrong with percpu_counter_batch getting
> 	 * accumulated on each CPU without updating global counters
> 	 * Delalloc need an accurate free block accounting. So switch
> 	 * to non delalloc when we are near to error range.
> 	 */
> 

So it seems possible that the s_dirtyclusters_counter and
i_reserved_data_blocks counters are no longer maintained once the
filesystem is remounted from 'delalloc' to 'nodelalloc', even while
writing back page cache that was delay-allocated when the filesystem was
still mounted as 'delalloc'. The counters can therefore be non-zero when
the inode is finally evicted and destroyed.

I think this inconsistency is problematic. For example, when the
filesystem is remounted from 'delalloc' to 'nodelalloc' and then runs
for a while, the s_dirtyclusters_counter and i_reserved_data_blocks
counters become inconsistent. If it is then remounted back to
'delalloc', it starts out with counters that are already incorrect.
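
To make the suspected sequence concrete, here is a small user-space
model. It is a sketch under one explicit assumption: that the release of
a reservation at writeback time is gated on the *current* mount option
rather than on how the block was originally reserved. All struct and
function names are hypothetical stand-ins, not kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model; names echo the thread but are not kernel APIs. */
struct sb_model {
	bool delalloc_opt;                 /* current mount option */
};

struct inode_model {
	unsigned int reserved_data_blocks; /* stands in for i_reserved_data_blocks */
	unsigned int delayed_blocks;       /* dirtied blocks awaiting allocation */
};

/* Dirty one new block; a reservation is taken only under delalloc. */
static void model_dirty_block(const struct sb_model *sb,
			      struct inode_model *ino)
{
	if (sb->delalloc_opt) {
		ino->reserved_data_blocks++;
		ino->delayed_blocks++;
	}
	/* Under nodelalloc the block is allocated at once; nothing is reserved. */
}

/*
 * Write back every delayed block.
 * ASSUMPTION: the reservation is released only if delalloc is enabled
 * at writeback time.
 */
static void model_writeback(const struct sb_model *sb,
			    struct inode_model *ino)
{
	while (ino->delayed_blocks > 0) {
		ino->delayed_blocks--;
		if (sb->delalloc_opt) {
			assert(ino->reserved_data_blocks > 0);
			ino->reserved_data_blocks--;
		}
	}
}

/* Reservations still held at eviction; non-zero would trigger the warning. */
static unsigned int model_evict(const struct inode_model *ino)
{
	return ino->reserved_data_blocks;
}
```

In this model, dirtying two blocks under 'delalloc' and writing them back
without a remount leaves nothing reserved at eviction. Flipping
delalloc_opt to false between the dirtying and the writeback leaves both
reservations behind, which matches the shape of the reported
"i_reserved_data_blocks (2) not cleared!" messages.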



-- 
Thanks,
Jeffle
