linux-ext4.vger.kernel.org archive mirror
* Query about ext4 commit interval vs dirty_expire_centisecs
From: Paul Richards @ 2019-11-19  8:47 UTC
  To: linux-ext4

Hello there,
I'm trying to understand the interaction between the ext4 `commit`
interval option and the `vm.dirty_expire_centisecs` tunable.

The ext4 `commit` documentation says:

> Ext4 can be told to sync all its data and metadata every 'nrsec' seconds. The default value is 5 seconds. This means that if you lose your power, you will lose as much as the latest 5 seconds of work (your filesystem will not be damaged though, thanks to the journaling).

The `dirty_expire_centisecs` documentation says:

> This tunable is used to define when dirty data is old enough to be eligible for writeout by the kernel flusher threads. It is expressed in 100'ths of a second. Data which has been dirty in-memory for longer than this interval will be written out next time a flusher thread wakes up.


Superficially these sound like they have a very similar effect: they
periodically flush out data that hasn't been explicitly fsync'd by the
application.  I'd like to understand the interaction between them a bit
better.


What happens when the ext4 commit interval is shorter than the
dirty_expire_centisecs setting?  (Does the latter become "redundant"?)

What happens when the dirty_expire_centisecs setting is shorter than
the ext4 commit interval?

Thanks,

* Re: Query about ext4 commit interval vs dirty_expire_centisecs
From: Jan Kara @ 2019-12-13 15:59 UTC
  To: Paul Richards; +Cc: linux-ext4

Hello!

On Tue 19-11-19 08:47:31, Paul Richards wrote:
> I'm trying to understand the interaction between the ext4 `commit`
> interval option, and the `vm.dirty_expire_centisecs` tuneable.
> 
> The ext4 `commit` documentation says:
> 
> > Ext4 can be told to sync all its data and metadata every 'nrsec' seconds. The default value is 5 seconds. This means that if you lose your power, you will lose as much as the latest 5 seconds of work (your filesystem will not be damaged though, thanks to the journaling).
> 
> The `dirty_expire_centisecs` documentation says:
> 
> > This tunable is used to define when dirty data is old enough to be eligible for writeout by the kernel flusher threads. It is expressed in 100'ths of a second. Data which has been dirty in-memory for longer than this interval will be written out next time a flusher thread wakes up.
> 
> 
> Superficially these sound like they have a very similar effect.  They
> periodically flush out data that hasn't been explicitly fsync'd by the
> application.  I'd like to understand a bit more the interaction
> between these.

Yes, the effect is rather similar but not quite the same. The first thing
to observe is the fairly obvious fact that the ext4 commit interval
influences just the particular filesystem, while dirty_expire_centisecs
influences the behavior of global writeback over all filesystems.

Secondly, the commit interval is really the maximum age of an ext4
transaction. So if there is a metadata change pending in the journal, it
will become persistent at the latest after this time. So, say, a 'mkdir'
will be persistent at the latest after this time. For data operations things
are more complex. E.g. when delayed allocation is used (which is the
default), the change gets logged in the journal only during writeback. So it
can take up to dirty_expire_centisecs for data to be written back from the
page cache, which results in the filesystem journalling block allocations
etc., and then it can take up to the commit interval for these changes to
become persistent. So in this case the intervals add up. There are also
other special cases somewhere in between, but generally it is reasonable to
assume that data becomes persistent automatically within
dirty_expire_centisecs + commit_interval time. Note that both of these times
are actually the times when writeback is triggered, so if the disk gets too
busy, the actual time until data is completely on disk may be much longer.

								Honza

-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR
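
As a rough illustration of the arithmetic in the reply above, the sketch
below adds the two intervals for a given set of mount options. It is only a
sketch under stated assumptions: the function names are hypothetical, the
fallback of 3000 centiseconds is the usual kernel default for
`vm.dirty_expire_centisecs`, and the result is a rule of thumb for when
writeback is triggered, not a guarantee of when data reaches the disk.

```python
def parse_commit_seconds(mount_options: str, default: int = 5) -> int:
    """Return the commit= value from a comma-separated mount option string.

    Falls back to `default` (5 s, per the ext4 docs quoted in the thread)
    when the option is absent or zero ("Zero means default").
    """
    for opt in mount_options.split(","):
        key, _, value = opt.partition("=")
        if key.strip() == "commit" and value.isdigit():
            seconds = int(value)
            return seconds if seconds > 0 else default
    return default


def read_dirty_expire_centisecs(
    path: str = "/proc/sys/vm/dirty_expire_centisecs", fallback: int = 3000
) -> int:
    """Read the sysctl value; use `fallback` (assumed default) if unreadable."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return fallback


def rough_persistence_bound(dirty_expire_centisecs: int, commit_seconds: int) -> float:
    """Rule-of-thumb delay in seconds: dirty_expire_centisecs + commit interval."""
    return dirty_expire_centisecs / 100 + commit_seconds


if __name__ == "__main__":
    expire = read_dirty_expire_centisecs()
    # Example option strings; these are illustrative, not read from the system.
    for options in ("rw,relatime", "rw,relatime,commit=15"):
        commit = parse_commit_seconds(options)
        bound = rough_persistence_bound(expire, commit)
        print(f"{options!r}: roughly {bound:.0f} s until writeback is triggered")
```

With the assumed defaults (3000 centiseconds and a 5 second commit interval)
the rule of thumb gives about 35 seconds, rather than the 5 seconds the
quoted ext4 documentation suggests.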

* Re: Query about ext4 commit interval vs dirty_expire_centisecs
From: Paul Richards @ 2019-12-17 14:42 UTC
  To: Jan Kara; +Cc: linux-ext4

On Fri, 13 Dec 2019 at 15:59, Jan Kara <jack@suse.cz> wrote:
>
> Hello!
>
> On Tue 19-11-19 08:47:31, Paul Richards wrote:
> > I'm trying to understand the interaction between the ext4 `commit`
> > interval option, and the `vm.dirty_expire_centisecs` tuneable.
> >
> > The ext4 `commit` documentation says:
> >
> > > Ext4 can be told to sync all its data and metadata every 'nrsec' seconds. The default value is 5 seconds. This means that if you lose your power, you will lose as much as the latest 5 seconds of work (your filesystem will not be damaged though, thanks to the journaling).
> >
> > The `dirty_expire_centisecs` documentation says:
> >
> > > This tunable is used to define when dirty data is old enough to be eligible for writeout by the kernel flusher threads. It is expressed in 100'ths of a second. Data which has been dirty in-memory for longer than this interval will be written out next time a flusher thread wakes up.
> >
> >
> > Superficially these sound like they have a very similar effect.  They
> > periodically flush out data that hasn't been explicitly fsync'd by the
> > application.  I'd like to understand a bit more the interaction
> > between these.
>
> Yes, the effect is rather similar but not quite the same. The first thing
> to observe is kind of obvious fact that ext4 commit interval influences
> just the particular filesystem while dirty_expire_centisecs influences
> behavior of global writeback over all filesystems.
>
> Secondly, commit interval is really the maximum age of ext4 transaction.  So
> if there is metadata change pending in the journal, it will become
> persistent at latest after this time. So for say 'mkdir' that will be
> persistent at latest after this time. For data operations things are more
> complex. E.g. when delayed allocation is used (which is the default), the
> change gets logged in the journal only during writeback. So it can take up
> to dirty_expire_centisecs for data to be written back from page cache, that
> results in filesystem journalling block allocations etc. and then it can
> take up to commit interval for these changes to become persistent. So in
> this case the intervals add up. There are also other special cases
> somewhere in between but generally it is reasonable to assume that data gets
> automatically persistent in dirty_expire_centisecs + commit_interval time.
> Note both these times are actually times when writeback is triggered so
> if the disk gets too busy, the actual time when data is completely on disk
> may be much higher.
>

Thanks for taking the time to reply!

Since automatic persisting of data occurs only after
dirty_expire_centisecs + commit_interval,
should the ext4 docs be corrected?  They currently state (for the
commit interval option):

"The default value is 5 seconds. This means that if you lose
your power, you will lose as much as the latest 5 seconds of work"

* Re: Query about ext4 commit interval vs dirty_expire_centisecs
From: Jan Kara @ 2019-12-18  8:33 UTC
  To: Paul Richards; +Cc: Jan Kara, linux-ext4

On Tue 17-12-19 14:42:48, Paul Richards wrote:
> On Fri, 13 Dec 2019 at 15:59, Jan Kara <jack@suse.cz> wrote:
> >
> > Hello!
> >
> > On Tue 19-11-19 08:47:31, Paul Richards wrote:
> > > I'm trying to understand the interaction between the ext4 `commit`
> > > interval option, and the `vm.dirty_expire_centisecs` tuneable.
> > >
> > > The ext4 `commit` documentation says:
> > >
> > > > Ext4 can be told to sync all its data and metadata every 'nrsec' seconds. The default value is 5 seconds. This means that if you lose your power, you will lose as much as the latest 5 seconds of work (your filesystem will not be damaged though, thanks to the journaling).
> > >
> > > The `dirty_expire_centisecs` documentation says:
> > >
> > > > This tunable is used to define when dirty data is old enough to be eligible for writeout by the kernel flusher threads. It is expressed in 100'ths of a second. Data which has been dirty in-memory for longer than this interval will be written out next time a flusher thread wakes up.
> > >
> > >
> > > Superficially these sound like they have a very similar effect.  They
> > > periodically flush out data that hasn't been explicitly fsync'd by the
> > > application.  I'd like to understand a bit more the interaction
> > > between these.
> >
> > Yes, the effect is rather similar but not quite the same. The first thing
> > to observe is kind of obvious fact that ext4 commit interval influences
> > just the particular filesystem while dirty_expire_centisecs influences
> > behavior of global writeback over all filesystems.
> >
> > Secondly, commit interval is really the maximum age of ext4 transaction.  So
> > if there is metadata change pending in the journal, it will become
> > persistent at latest after this time. So for say 'mkdir' that will be
> > persistent at latest after this time. For data operations things are more
> > complex. E.g. when delayed allocation is used (which is the default), the
> > change gets logged in the journal only during writeback. So it can take up
> > to dirty_expire_centisecs for data to be written back from page cache, that
> > results in filesystem journalling block allocations etc. and then it can
> > take up to commit interval for these changes to become persistent. So in
> > this case the intervals add up. There are also other special cases
> > somewhere in between but generally it is reasonable to assume that data gets
> > automatically persistent in dirty_expire_centisecs + commit_interval time.
> > Note both these times are actually times when writeback is triggered so
> > if the disk gets too busy, the actual time when data is completely on disk
> > may be much higher.
> >
> 
> Thanks for taking the time to reply!
> 
> Since automatic persisting of data occurs only after
> dirty_expire_centisecs + commit_interval,
> should the ext4 docs be corrected?  They currently state (for the
> commit interval option):
> 
> "The default value is 5 seconds. This means that if you lose
> your power, you will lose as much as the latest 5 seconds of work"

Yes, probably that should be clarified. Where did you find this wording?
Because my ext4 manpage just states:

        commit=nrsec
              Start  a  journal commit every nrsec seconds.  The default value
              is 5 seconds.  Zero means default.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

* Re: Query about ext4 commit interval vs dirty_expire_centisecs
From: Paul Richards @ 2019-12-18 10:35 UTC
  To: Jan Kara; +Cc: linux-ext4

I found it here:
https://www.kernel.org/doc/Documentation/filesystems/ext4.txt

I think this might be the source, but I'm not sure:
https://github.com/torvalds/linux/blame/master/Documentation/admin-guide/ext4.rst#L185-L187

While searching for this I also found a copy of the same `commit`
documentation here:
https://github.com/torvalds/linux/blob/master/Documentation/filesystems/ocfs2.txt
I don't know if the same correction should be made for ocfs2 or not.




* Re: Query about ext4 commit interval vs dirty_expire_centisecs
From: Jan Kara @ 2019-12-18 11:22 UTC
  To: Paul Richards; +Cc: Jan Kara, linux-ext4

On Wed 18-12-19 10:35:56, Paul Richards wrote:
> I found it here:
> https://www.kernel.org/doc/Documentation/filesystems/ext4.txt
> 
> I think this might be the source, but I'm not sure:
> https://github.com/torvalds/linux/blame/master/Documentation/admin-guide/ext4.rst#L185-L187

Yes, that's it. Somehow my grep capabilities failed me :-| Thanks for the
pointer.

> While searching for this I also found a copy of the same `commit`
> documentation here:
> https://github.com/torvalds/linux/blob/master/Documentation/filesystems/ocfs2.txt
> I don't know if the same correction should be made for ocfs2 or not.

For OCFS2 it is somewhat different since it doesn't do delayed allocation.
So the text is actually correct for file creation. It is still incorrect
for file overwrites, though, on which the commit interval has no effect.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR
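
Since both intervals discussed in this thread only bound when writeback is
triggered, an application that needs its data on disk at a known point
would typically fsync explicitly, as the original question hints. A minimal
sketch in Python follows; the helper name `write_durably` is hypothetical,
and syncing the parent directory is an extra step assumed here so that a
newly created directory entry is persisted as well (Linux behavior).

```python
import os
import tempfile


def write_durably(path: str, data: bytes) -> None:
    """Write `data` to `path` and ask the kernel to persist it before returning."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()              # push Python's userspace buffer to the kernel
        os.fsync(f.fileno())   # request that file data and metadata hit storage

    # Also fsync the containing directory so the new entry itself is persisted.
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "example.dat")
        write_durably(target, b"important bytes\n")
        with open(target, "rb") as f:
            print(f.read())
```

This sidesteps the question of how the commit interval and
dirty_expire_centisecs add up, at the cost of waiting for the device on
every call.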
