From: Linus Torvalds <torvalds@linux-foundation.org>
To: "Theodore Ts'o" <tytso@mit.edu>
Cc: Linux Kernel Developers List <linux-kernel@vger.kernel.org>,
	Ext4 Developers List <linux-ext4@vger.kernel.org>
Subject: Re: [GIT PULL] Ext3 latency fixes
Date: Thu, 9 Apr 2009 08:49:27 -0700 (PDT)
Message-ID: <alpine.LFD.2.00.0904090836060.4583@localhost.localdomain>
In-Reply-To: <E1LrhNF-0002zd-BG@closure.thunk.org>



On Wed, 8 Apr 2009, Theodore Ts'o wrote:
> 
> One of these patches fixes a performance regression caused by a64c8610,
> which unplugged the write queue after every page write.  Now that Jens
> added WRITE_SYNC_PLUG, the patch causes us to use it instead of
> WRITE_SYNC, to avoid the implicit unplugging.  These patches also seem
> to further improve ext3 latency, especially during the "sync" command
> in Linus's write-big-file-and-sync workload.

So here's a question and an untested _conceptual_ patch.

The kind of writeback mode I'd personally prefer would be more of a 
mixture of the current "data=writeback" and "data=ordered" modes, with 
something of the best of both worlds. I'd like the data writeback to get 
_started_ when the journal is written to disk, but I'd like it to not 
block journal updates.

IOW, it wouldn't be "strictly ordered", but at the same time it wouldn't 
be totally unordered either.
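
To make the distinction concrete, here is a minimal sketch of how the three
behaviours differ at commit time. It is a toy model, not JBD code: the names
data_mode, commit_plan, plan_commit and check_modes are hypothetical and do
not exist in fs/jbd, and "MODE_BACKGROUND" is only a label for the half-way
mode described above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for illustration only; nothing here exists in fs/jbd. */
enum data_mode {
	MODE_WRITEBACK,		/* data=writeback: commit ignores file data */
	MODE_ORDERED,		/* data=ordered: commit submits data and waits */
	MODE_BACKGROUND		/* proposed: commit submits data, does not wait */
};

struct commit_plan {
	bool submit_data;	/* start data I/O when the commit begins */
	bool wait_for_data;	/* block the journal commit until data I/O is done */
};

/* Decide what a journal commit does with dirty data buffers in each mode. */
static struct commit_plan plan_commit(enum data_mode mode)
{
	struct commit_plan plan = { false, false };

	switch (mode) {
	case MODE_WRITEBACK:
		break;
	case MODE_ORDERED:
		plan.submit_data = true;
		plan.wait_for_data = true;
		break;
	case MODE_BACKGROUND:
		plan.submit_data = true;
		break;
	}
	return plan;
}

/* Self-check: the proposed mode starts the I/O but never blocks on it. */
static void check_modes(void)
{
	assert(plan_commit(MODE_ORDERED).wait_for_data);
	assert(plan_commit(MODE_BACKGROUND).submit_data);
	assert(!plan_commit(MODE_BACKGROUND).wait_for_data);
	assert(!plan_commit(MODE_WRITEBACK).submit_data);
}
```

The only difference between MODE_ORDERED and MODE_BACKGROUND in this model is
the wait_for_data flag, which is the whole point of the proposal.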

For true sync operations (ie fsync()), the VFS layer then does the proper 
"wait for data" part.
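
From user space, that explicit wait looks like the sketch below. It uses
only standard POSIX calls (open, write, fsync, close); the helper name
write_and_sync is made up for illustration, and error handling is kept to
the minimum needed to show where the blocking happens.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write len bytes to path, then block in fsync() until the kernel reports
 * the data as written out. Returns 0 on success, -1 on any error.
 * write_and_sync is a hypothetical helper, not an existing API.
 */
static int write_and_sync(const char *path, const void *buf, size_t len)
{
	const char *p = buf;
	int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);

	if (fd < 0)
		return -1;

	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n <= 0) {
			close(fd);
			return -1;
		}
		p += n;
		len -= (size_t)n;
	}

	/* This is the "wait for data" step: it does not depend on the journal mode. */
	if (fsync(fd) != 0) {
		close(fd);
		return -1;
	}
	return close(fd);
}
```

An application that needs durability calls fsync() itself, so it would keep
its guarantee even if the journal commit stopped waiting for data buffers.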

I dunno. I don't actually know the JBD internal constraints, but what I'm 
talking about is something like the appended patch. It wouldn't help under 
really heavy writeback IO (because even if we don't end up waiting for all 
the random data to complete, we'd end up waiting when _submitting_ it), 
but it might help under somewhat less extreme loads.

This is totally untested. It might well violate some serious internal jbd 
rules and eat your filesystem, for all I know. I'm throwing the patch out 
as a "would something _like_ this perhaps make sense as a half-way-point
between 'ordered' and 'writeback'?", nothing more.

Hmm?

		Linus
---
 fs/jbd/commit.c |   11 ++++++++++-
 1 files changed, 10 insertions(+), 1 deletions(-)

diff --git a/fs/jbd/commit.c b/fs/jbd/commit.c
index a8e8513..5bea3ed 100644
--- a/fs/jbd/commit.c
+++ b/fs/jbd/commit.c
@@ -184,6 +184,9 @@ static void journal_do_submit_data(struct buffer_head **wbuf, int bufs,
 	}
 }
 
+/* This would obviously be a real flag, set at mount time */
+#define BACKGROUND_DATA(journal) (1)
+
 /*
  *  Submit all the data buffers to disk
  */
@@ -198,6 +201,9 @@ static int journal_submit_data_buffers(journal_t *journal,
 	struct buffer_head **wbuf = journal->j_wbuf;
 	int err = 0;
 
+	if (BACKGROUND_DATA(journal))
+		write_op = WRITE;
+
 	/*
 	 * Whenever we unlock the journal and sleep, things can get added
 	 * onto ->t_sync_datalist, so we have to keep looping back to
@@ -254,7 +260,10 @@ write_out_data:
 		if (locked && test_clear_buffer_dirty(bh)) {
 			BUFFER_TRACE(bh, "needs writeout, adding to array");
 			wbuf[bufs++] = bh;
-			__journal_file_buffer(jh, commit_transaction,
+			if (BACKGROUND_DATA(journal))
+				__journal_unfile_buffer(jh);
+			else
+				__journal_file_buffer(jh, commit_transaction,
 						BJ_Locked);
 			jbd_unlock_bh_state(bh);
 			if (bufs == journal->j_wbufsize) {
