From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S933935Ab0J1REx (ORCPT ); Thu, 28 Oct 2010 13:04:53 -0400
Received: from rcsinet10.oracle.com ([148.87.113.121]:33268 "EHLO
	rcsinet10.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1760517Ab0J1REp (ORCPT );
	Thu, 28 Oct 2010 13:04:45 -0400
Date: Thu, 28 Oct 2010 13:01:32 -0400
From: Chris Mason
To: Ingo Molnar
Cc: Pekka Enberg , Aidar Kultayev , linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, Linus Torvalds , Andrew Morton , Jens Axboe ,
	Peter Zijlstra , Nick Piggin , Arjan van de Ven , Thomas Gleixner
Subject: Re: 2.6.36 io bring the system to its knees
Message-ID: <20101028170132.GY27796@think>
Mail-Followup-To: Chris Mason , Ingo Molnar , Pekka Enberg ,
	Aidar Kultayev , linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Linus Torvalds , Andrew Morton , Jens Axboe , Peter Zijlstra ,
	Nick Piggin , Arjan van de Ven , Thomas Gleixner
References: <20101028090002.GA12446@elte.hu> <20101028133036.GA30565@elte.hu>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20101028133036.GA30565@elte.hu>
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Oct 28, 2010 at 03:30:36PM +0200, Ingo Molnar wrote:
>
> "Many seconds freezes" and slowdowns wont be fixed via the VFS scalability patches
> i'm afraid.
>
> This has the appearance of some really bad IO or VM latency problem. Unfixed and
> present in stable kernel versions going from years ago all the way to v2.6.36.

Hmmm, the workload you're describing here has two special parts. First
it dramatically overloads the disk, and then it has guis doing things
waiting for the disk.
The virtualbox part of the workload is probably filling the queue with
huge amounts of synchronous random IO (I'm assuming it is going in via
O_DIRECT), and this will defeat any attempts from the filesystem to
tell the elevator "hey look, my IO is synchronous, please do hurry".

So, I'd try mounting ext4 in data=writeback mode. I can't make ext4
stall fsyncs on non-fsync IO locally and it looks like they have solved
the ext3 data=ordered problem. But I still like to rule out old and
known issues before we dig into new things.

I'd also suggest something like the below patch which is entirely
untested and must be blessed by an actual ext4 developer. I think we
can make fsync faster if we put the mutex locking down in the FS, but
until then it should be ok to drop the mutex while we are doing the
expensive log commits:

diff --git a/fs/ext4/fsync.c b/fs/ext4/fsync.c
index 592adf2..1b7a637 100644
--- a/fs/ext4/fsync.c
+++ b/fs/ext4/fsync.c
@@ -114,6 +114,7 @@ int ext4_sync_file(struct file *file, int datasync)
 	if (ext4_should_journal_data(inode))
 		return ext4_force_commit(inode->i_sb);
 
+	mutex_unlock(&inode->i_mutex);
 	commit_tid = datasync ? ei->i_datasync_tid : ei->i_sync_tid;
 	if (jbd2_log_start_commit(journal, commit_tid)) {
 		/*
@@ -133,5 +134,7 @@ int ext4_sync_file(struct file *file, int datasync)
 	} else if (journal->j_flags & JBD2_BARRIER)
 		blkdev_issue_flush(inode->i_sb->s_bdev, GFP_KERNEL, NULL,
 			BLKDEV_IFL_WAIT);
+
+	mutex_lock(&inode->i_mutex);
 	return ret;
 }
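
For readers who want to see the locking pattern from the patch outside
the kernel, here is a rough userspace sketch using pthreads. It only
illustrates "caller holds the lock, callee drops it around the slow
part and retakes it before returning". Everything in it (struct
fake_inode, slow_commit(), sync_dropping_lock()) is a made-up stand-in,
not ext4 or jbd2 code, and unlike the patch it samples the transaction
id before unlocking, to keep the userspace example free of data races.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for an inode: a lock plus a transaction id. */
struct fake_inode {
	pthread_mutex_t lock;	/* plays the role of inode->i_mutex */
	unsigned int sync_tid;	/* plays the role of ei->i_sync_tid */
};

/*
 * Hypothetical stand-in for the expensive journal commit. A real commit
 * would block on disk IO here; this one just echoes the id it was given.
 */
static int slow_commit(unsigned int tid)
{
	return (int)tid;
}

/*
 * Must be called with inode->lock held, and returns with it held again.
 * The lock is released only while the slow commit runs, so other threads
 * that need the lock are not stuck behind the commit.
 */
static int sync_dropping_lock(struct fake_inode *inode)
{
	unsigned int tid = inode->sync_tid;	/* sampled while still locked */
	int rc, ret;

	rc = pthread_mutex_unlock(&inode->lock);
	assert(rc == 0);

	ret = slow_commit(tid);			/* expensive part, lock not held */

	rc = pthread_mutex_lock(&inode->lock);
	assert(rc == 0);
	(void)rc;				/* silence unused warning under NDEBUG */

	return ret;
}
```

A caller would take the lock, call sync_dropping_lock(), and unlock
afterwards, the same way the VFS wraps the fsync call in the patch
above. Whether anything protected by the lock can change during the
window is exactly the question an ext4 developer would have to answer
for the real patch.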