From: Chris Mason <chris.mason@oracle.com>
To: Ingo Molnar <mingo@elte.hu>
Cc: Pekka Enberg <penberg@kernel.org>,
	Aidar Kultayev <the.aidar@gmail.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Jens Axboe <axboe@kernel.dk>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Nick Piggin <npiggin@suse.de>,
	Arjan van de Ven <arjan@infradead.org>,
	Thomas Gleixner <tglx@linutronix.de>
Subject: Re: 2.6.36 io bring the system to its knees
Date: Thu, 28 Oct 2010 13:01:32 -0400
Message-ID: <20101028170132.GY27796@think>
In-Reply-To: <20101028133036.GA30565@elte.hu>

On Thu, Oct 28, 2010 at 03:30:36PM +0200, Ingo Molnar wrote:
>
> "Many seconds freezes" and slowdowns wont be fixed via the VFS scalability patches
> i'm afraid.
>
> This has the appearance of some really bad IO or VM latency problem. Unfixed and
> present in stable kernel versions going from years ago all the way to v2.6.36.

Hmmm, the workload you're describing here has two special parts.  First
it dramatically overloads the disk, and then it has guis doing things
waiting for the disk.

The virtualbox part of the workload is probably filling the queue with
huge amounts of synchronous random IO (I'm assuming it is going in via
O_DIRECT), and this will defeat any attempts from the filesystem to tell
the elevator "hey look, my IO is synchronous, please do hurry"

So, I'd try mounting ext4 in data=writeback mode.  I can't make ext4
stall fsyncs on non-fsync IO locally and it looks like they have solved
the ext3 data=ordered problem.  But I still like to rule out old and
known issues before we dig into new things.

I'd also suggest something like the below patch which is entirely
untested and must be blessed by an actual ext4 developer.

I think we can make fsync faster if we put the mutex locking down in the
FS, but until then it should be ok to drop the mutex while we are doing
the expensive log commits:

diff --git a/fs/ext4/fsync.c b/fs/ext4/fsync.c
index 592adf2..1b7a637 100644
--- a/fs/ext4/fsync.c
+++ b/fs/ext4/fsync.c
@@ -114,6 +114,7 @@ int ext4_sync_file(struct file *file, int datasync)
 	if (ext4_should_journal_data(inode))
 		return ext4_force_commit(inode->i_sb);
 
+	mutex_unlock(&inode->i_mutex);
 	commit_tid = datasync ? ei->i_datasync_tid : ei->i_sync_tid;
 	if (jbd2_log_start_commit(journal, commit_tid)) {
 		/*
@@ -133,5 +134,7 @@ int ext4_sync_file(struct file *file, int datasync)
 	} else if (journal->j_flags & JBD2_BARRIER)
 		blkdev_issue_flush(inode->i_sb->s_bdev, GFP_KERNEL, NULL,
 			BLKDEV_IFL_WAIT);
+
+	mutex_lock(&inode->i_mutex);
 	return ret;
 }
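
For readers who want to see the locking pattern from the patch outside the kernel, here is a rough userspace sketch using pthreads.  It is only an analogy, not ext4 code: struct fake_inode, slow_commit() and sync_file() are hypothetical names invented for this illustration, and it assumes (as the patch does) that the caller enters with the lock held and expects it to be held again on return.

```c
#include <pthread.h>

/* Hypothetical stand-in for an inode; not a kernel structure. */
struct fake_inode {
	pthread_mutex_t lock;          /* plays the role of i_mutex */
	int lock_free_during_commit;   /* set if the commit saw the lock free */
};

/*
 * Hypothetical stand-in for the expensive log commit.  It probes the
 * lock with trylock to record whether the caller dropped it first.
 */
static int slow_commit(struct fake_inode *ino)
{
	if (pthread_mutex_trylock(&ino->lock) == 0) {
		ino->lock_free_during_commit = 1;
		pthread_mutex_unlock(&ino->lock);
	}
	return 0;
}

/*
 * Assumption: the caller holds ino->lock on entry and expects it to be
 * held on return, so the lock is dropped only around the slow part and
 * retaken before returning.
 */
static int sync_file(struct fake_inode *ino)
{
	int ret;

	pthread_mutex_unlock(&ino->lock);
	ret = slow_commit(ino);
	pthread_mutex_lock(&ino->lock);
	return ret;
}
```

A caller would take the lock, call sync_file(), and release the lock afterwards.  While slow_commit() runs, other threads that need the same lock can make progress, which is the point of the change; the cost is that anything the lock was protecting may have changed by the time it is retaken.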