From: Chris Mason <chris.mason@oracle.com>
To: Ingo Molnar <mingo@elte.hu>
Cc: Pekka Enberg <penberg@kernel.org>,
	Aidar Kultayev <the.aidar@gmail.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Jens Axboe <axboe@kernel.dk>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Nick Piggin <npiggin@suse.de>,
	Arjan van de Ven <arjan@infradead.org>,
	Thomas Gleixner <tglx@linutronix.de>
Subject: Re: 2.6.36 io bring the system to its knees
Date: Thu, 28 Oct 2010 13:01:32 -0400
Message-ID: <20101028170132.GY27796@think>
In-Reply-To: <20101028133036.GA30565@elte.hu>

On Thu, Oct 28, 2010 at 03:30:36PM +0200, Ingo Molnar wrote:
> 
> "Many seconds freezes" and slowdowns won't be fixed via the VFS scalability patches
> I'm afraid.
> 
> This has the appearance of some really bad IO or VM latency problem. Unfixed and 
> present in stable kernel versions going from years ago all the way to v2.6.36.

Hmmm, the workload you're describing here has two special parts.  First
it dramatically overloads the disk, and then it has GUIs doing things
that wait for the disk.

The VirtualBox part of the workload is probably filling the queue with
huge amounts of synchronous random IO (I'm assuming it is going in via
O_DIRECT), and this will defeat any attempts from the filesystem to tell
the elevator "hey look, my IO is synchronous, please do hurry".

So, I'd try mounting ext4 in data=writeback mode.  I can't make ext4
stall fsyncs on non-fsync IO locally and it looks like they have solved
the ext3 data=ordered problem.  But I still like to rule out old and
known issues before we dig into new things.
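
To compare fsync latency before and after a change such as the
data=writeback mount, a small userspace probe along these lines may
help.  This is only a sketch: time_fsync_ms() is a made-up helper name,
not something from the thread, and it times a single fsync() on a
freshly written scratch file.  Run it while the disk is overloaded to
see whether the fsync stalls.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: write `bytes` bytes to `path`, then time one
 * fsync() on it.  Returns the elapsed time in milliseconds, or -1.0 on
 * any error (including a short write).  The scratch file is removed
 * before returning.
 */
double time_fsync_ms(const char *path, size_t bytes)
{
	assert(path != NULL);

	char *buf = malloc(bytes ? bytes : 1);
	if (!buf)
		return -1.0;
	memset(buf, 'x', bytes);

	int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
	if (fd < 0) {
		free(buf);
		return -1.0;
	}

	double ms = -1.0;
	ssize_t n = write(fd, buf, bytes);
	if (n == (ssize_t)bytes) {
		struct timespec t0, t1;

		clock_gettime(CLOCK_MONOTONIC, &t0);
		int rc = fsync(fd);
		clock_gettime(CLOCK_MONOTONIC, &t1);
		if (rc == 0)
			ms = (t1.tv_sec - t0.tv_sec) * 1e3 +
			     (t1.tv_nsec - t0.tv_nsec) / 1e6;
	}

	close(fd);
	unlink(path);
	free(buf);
	return ms;
}
```

Calling it in a loop on the filesystem under test, with the heavy
workload running in the background, gives a rough picture of how long
fsync callers are kept waiting.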

I'd also suggest something like the below patch which is entirely
untested and must be blessed by an actual ext4 developer.  I think we
can make fsync faster if we put the mutex locking down in the FS, but
until then it should be ok to drop the mutex while we are doing the
expensive log commits:

diff --git a/fs/ext4/fsync.c b/fs/ext4/fsync.c
index 592adf2..1b7a637 100644
--- a/fs/ext4/fsync.c
+++ b/fs/ext4/fsync.c
@@ -114,6 +114,7 @@ int ext4_sync_file(struct file *file, int datasync)
 	if (ext4_should_journal_data(inode))
 		return ext4_force_commit(inode->i_sb);
 
+	mutex_unlock(&inode->i_mutex);
 	commit_tid = datasync ? ei->i_datasync_tid : ei->i_sync_tid;
 	if (jbd2_log_start_commit(journal, commit_tid)) {
 		/*
@@ -133,5 +134,7 @@ int ext4_sync_file(struct file *file, int datasync)
 	} else if (journal->j_flags & JBD2_BARRIER)
 		blkdev_issue_flush(inode->i_sb->s_bdev, GFP_KERNEL, NULL,
 			BLKDEV_IFL_WAIT);
+
+	mutex_lock(&inode->i_mutex);
 	return ret;
 }
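
The locking pattern in the patch can be shown in userspace with a
pthread mutex.  This is an analogy only, not kernel code: struct
fake_inode, sync_dropping_lock() and probe_commit() are made-up names,
and the sketch assumes, as the patch implies, that the caller holds the
lock on entry and releases it after the function returns.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for an inode: one lock plus a slow "commit" hook. */
struct fake_inode {
	pthread_mutex_t lock;                    /* plays the role of i_mutex */
	int (*slow_commit)(struct fake_inode *); /* plays the role of the log commit */
};

/* Called with ino->lock held; returns with ino->lock held again. */
int sync_dropping_lock(struct fake_inode *ino)
{
	int ret;

	assert(ino != NULL && ino->slow_commit != NULL);

	pthread_mutex_unlock(&ino->lock);  /* let other lockers in during the slow part */
	ret = ino->slow_commit(ino);
	pthread_mutex_lock(&ino->lock);    /* restore the state the caller expects */
	return ret;
}

/* Example hook: returns 0 if the lock was free while it ran, -EBUSY if not. */
int probe_commit(struct fake_inode *ino)
{
	if (pthread_mutex_trylock(&ino->lock) == 0) {
		pthread_mutex_unlock(&ino->lock);
		return 0;
	}
	return -EBUSY;
}
```

The price of this pattern is that anything the lock protects may change
while it is dropped, so state read before the unlock cannot be trusted
after the relock without checking it again.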


