From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753501Ab1I2LFb (ORCPT );
	Thu, 29 Sep 2011 07:05:31 -0400
Received: from mga14.intel.com ([143.182.124.37]:56835 "EHLO mga14.intel.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1751701Ab1I2LFa (ORCPT );
	Thu, 29 Sep 2011 07:05:30 -0400
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="4.68,460,1312182000"; d="scan'208";a="56911725"
Date: Thu, 29 Sep 2011 19:05:25 +0800
From: Wu Fengguang
To: Peter Zijlstra
Cc: "linux-fsdevel@vger.kernel.org", Andrew Morton, Jan Kara,
	Christoph Hellwig, Dave Chinner, Greg Thelen, Minchan Kim,
	Vivek Goyal, Andrea Righi, linux-mm, LKML
Subject: Re: [PATCH 10/18] writeback: dirty position control - bdi reserve area
Message-ID: <20110929110525.GA10979@localhost>
References: <1315318179.14232.3.camel@twins>
	<20110907123108.GB6862@localhost>
	<1315822779.26517.23.camel@twins>
	<20110918141705.GB15366@localhost>
	<20110918143721.GA17240@localhost>
	<20110918144751.GA18645@localhost>
	<20110928140205.GA26617@localhost>
	<1317221435.24040.39.camel@twins>
	<20110929033201.GA21722@localhost>
	<1317286197.22581.4.camel@twins>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1317286197.22581.4.camel@twins>
User-Agent: Mutt/1.5.20 (2009-06-14)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Sep 29, 2011 at 04:49:57PM +0800, Peter Zijlstra wrote:
> On Thu, 2011-09-29 at 11:32 +0800, Wu Fengguang wrote:
> > > Now I guess the only problem is when nr_bdi * MIN_WRITEBACK_PAGES ~
> > > limit, at which point things go pear shaped.
> >
> > Yes. In that case the global @dirty will always be drove up to @limit.
> > Once @dirty dropped reasonably below, whichever bdi task wakeup first
> > will take the chance to fill the gap, which is not fair for bdi's of
> > different speed.
> >
> > Let me retry the thresh=1M,10M test cases without MIN_WRITEBACK_PAGES.
> > Hopefully the removal of it won't impact performance a lot.
>
>
> Right, so alternatively we could try an argument that this is
> sufficiently rare and shouldn't happen. People with lots of disks tend
> to also have lots of memory, etc.

Right.

> If we do find it happens we can always look at it again.

Sure. I now have the results for the single disk thresh=1M,8M,100M cases
and found no big differences when removing MIN_WRITEBACK_PAGES:

    3.1.0-rc4-bgthresh3+      3.1.0-rc4-bgthresh4+
------------------------  ------------------------
                 3988742        +1.9%      4063217  thresh=100M/ext4-10dd-4k-8p-4096M-100M:10-X
                 4758884        +1.5%      4829320  thresh=100M/ext4-1dd-4k-8p-4096M-100M:10-X
                 4621240        +1.6%      4693525  thresh=100M/ext4-2dd-4k-8p-4096M-100M:10-X
                 3420717        +0.1%      3423712  thresh=100M/xfs-10dd-4k-8p-4096M-100M:10-X
                 4361830        +1.4%      4423554  thresh=100M/xfs-1dd-4k-8p-4096M-100M:10-X
                 3964043        +0.2%      3972057  thresh=100M/xfs-2dd-4k-8p-4096M-100M:10-X
                 2937926        +0.6%      2956870  thresh=1M/ext4-10dd-4k-8p-4096M-1M:10-X
                 4472552        -1.9%      4387457  thresh=1M/ext4-1dd-4k-8p-4096M-1M:10-X
                 4085707        -3.0%      3961155  thresh=1M/ext4-2dd-4k-8p-4096M-1M:10-X
                 2206897        +2.1%      2253839  thresh=1M/xfs-10dd-4k-8p-4096M-1M:10-X
                 4207336        -2.1%      4119821  thresh=1M/xfs-1dd-4k-8p-4096M-1M:10-X
                 3739888        -3.6%      3604315  thresh=1M/xfs-2dd-4k-8p-4096M-1M:10-X
                 3279302        -0.2%      3273310  thresh=8M/ext4-10dd-4k-8p-4096M-8M:10-X
                 4834878        +1.6%      4912372  thresh=8M/ext4-1dd-4k-8p-4096M-8M:10-X
                 4511120        -1.7%      4435193  thresh=8M/ext4-2dd-4k-8p-4096M-8M:10-X
                 2443874        -0.5%      2432188  thresh=8M/xfs-10dd-4k-8p-4096M-8M:10-X
                 4308416        -0.6%      4283110  thresh=8M/xfs-1dd-4k-8p-4096M-8M:10-X
                 3739810        +0.6%      3763320  thresh=8M/xfs-2dd-4k-8p-4096M-8M:10-X

Or when lowering the largest promotion ratio from 128 to 8:

    3.1.0-rc4-bgthresh4+      3.1.0-rc4-bgthresh5+
------------------------  ------------------------
                 4063217        -0.0%      4062022  thresh=100M/ext4-10dd-4k-8p-4096M-100M:10-X
                 4829320        +1.1%      4882829  thresh=100M/ext4-1dd-4k-8p-4096M-100M:10-X
                 4693525        +0.1%      4700537  thresh=100M/ext4-2dd-4k-8p-4096M-100M:10-X
                 3423712        +0.2%      3431603  thresh=100M/xfs-10dd-4k-8p-4096M-100M:10-X
                 4423554        -0.3%      4408912  thresh=100M/xfs-1dd-4k-8p-4096M-100M:10-X
                 3972057        -0.1%      3968535  thresh=100M/xfs-2dd-4k-8p-4096M-100M:10-X
                 2956870        -0.9%      2929605  thresh=1M/ext4-10dd-4k-8p-4096M-1M:10-X
                 4387457        -0.2%      4378233  thresh=1M/ext4-1dd-4k-8p-4096M-1M:10-X
                 3961155        -0.5%      3940075  thresh=1M/ext4-2dd-4k-8p-4096M-1M:10-X
                 2253839        -0.9%      2232976  thresh=1M/xfs-10dd-4k-8p-4096M-1M:10-X
                 4119821        -2.1%      4031983  thresh=1M/xfs-1dd-4k-8p-4096M-1M:10-X
                 3604315        -3.1%      3493042  thresh=1M/xfs-2dd-4k-8p-4096M-1M:10-X
                 3273310        -1.1%      3237060  thresh=8M/ext4-10dd-4k-8p-4096M-8M:10-X
                 4912372        -0.0%      4911287  thresh=8M/ext4-1dd-4k-8p-4096M-8M:10-X
                 4435193        +0.1%      4441581  thresh=8M/ext4-2dd-4k-8p-4096M-8M:10-X
                 2432188        +1.1%      2459249  thresh=8M/xfs-10dd-4k-8p-4096M-8M:10-X
                 4283110        +0.1%      4289456  thresh=8M/xfs-1dd-4k-8p-4096M-8M:10-X
                 3763320        -0.1%      3758938  thresh=8M/xfs-2dd-4k-8p-4096M-8M:10-X

As for the thresh=100M JBOD cases, I don't see many occurrences of
promotion ratio > 2. So the simplification should make no difference
there, either.

Thus the finalized code will be:

+	x_intercept = bdi_thresh / 2;
+	if (bdi_dirty < x_intercept) {
+		if (bdi_dirty > x_intercept / 8) {
+			pos_ratio *= x_intercept;
+			do_div(pos_ratio, bdi_dirty);
+		} else
+			pos_ratio *= 8;
+	}

Thanks,
Fengguang
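
For anyone who wants to check the shape of the boost outside the kernel, here is a minimal user-space sketch of the hunk above. It is only an illustration under assumptions: the function name reserve_area_boost() and the SKETCH_SHIFT fixed-point scale are made up for this example, and plain 64-bit division stands in for do_div().

```c
#include <stdint.h>

/* Hypothetical fixed-point scale for this sketch only: 1.0 == 1 << 10. */
#define SKETCH_SHIFT	10
#define SKETCH_ONE	(1ULL << SKETCH_SHIFT)

/*
 * User-space model of the quoted hunk (the name is invented here).
 * When the bdi's dirty pages drop below half of its threshold, pos_ratio
 * is scaled by x_intercept / bdi_dirty, with the boost capped at 8x.
 */
static uint64_t reserve_area_boost(uint64_t pos_ratio,
				   uint64_t bdi_thresh, uint64_t bdi_dirty)
{
	uint64_t x_intercept = bdi_thresh / 2;

	if (bdi_dirty < x_intercept) {
		if (bdi_dirty > x_intercept / 8) {
			/* bdi_dirty >= 1 on this path, so no divide by zero */
			pos_ratio = pos_ratio * x_intercept / bdi_dirty;
		} else {
			/* at or below x_intercept / 8: capped promotion */
			pos_ratio *= 8;
		}
	}
	return pos_ratio;
}
```

With bdi_thresh = 2048 pages and pos_ratio at 1.0 (SKETCH_ONE), bdi_dirty = 512 doubles the ratio, 256 quadruples it, and anything at or below 128 (including 0) gets the capped 8x. At or above 1024 pages the ratio is left untouched.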