From: Vivek Goyal <vgoyal@redhat.com>
To: Wu Fengguang <fengguang.wu@intel.com>
Cc: linux-fsdevel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Jan Kara <jack@suse.cz>, Christoph Hellwig <hch@lst.de>,
	Dave Chinner <david@fromorbit.com>,
	Greg Thelen <gthelen@google.com>,
	Minchan Kim <minchan.kim@gmail.com>,
	Andrea Righi <arighi@develer.com>,
	linux-mm <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 0/5] IO-less dirty throttling v8
Date: Mon, 8 Aug 2011 22:01:27 -0400
Message-ID: <20110809020127.GA3700@redhat.com>
In-Reply-To: <20110806084447.388624428@intel.com>

On Sat, Aug 06, 2011 at 04:44:47PM +0800, Wu Fengguang wrote:
> Hi all,
>
> The _core_ bits of the IO-less balance_dirty_pages().
> Heavily simplified and re-commented to make it easier to review.
>
> 	git://git.kernel.org/pub/scm/linux/kernel/git/wfg/writeback.git dirty-throttling-v8
>
> Only the bare minimal algorithms are presented, so you will find some rough
> edges in the graphs below. But it's usable :)
>
> 	http://www.kernel.org/pub/linux/kernel/people/wfg/writeback/dirty-throttling-v8/
>
> And an introduction to the (more complete) algorithms:
>
> 	http://www.kernel.org/pub/linux/kernel/people/wfg/writeback/slides/smooth-dirty-throttling.pdf
>
> Questions and reviews are highly appreciated!

Hi Wu,

I am going through slide number 39, where you talk about the approach
being future proof and usable for IO control purposes. You have listed
the following merits of this approach.

* per-bdi nature, works on NFS and Software RAID
* no delayed response (working at the right layer)
* no page tracking, hence decoupled from memcg
* no interactions with FS and CFQ
* get proportional IO controller for free
* reuse/inherit all the base facilities/functions

I would say that it would also be a good idea to list the demerits of
this approach in its current form, namely that it only deals with
controlling buffered write IO and nothing else. So on the same block
device, other direct writes might be going on from the same group, and in
this scheme a user will not have any control over them.

Another disadvantage is that throttling at the page cache level does not
take care of IO spikes at the device level.

Now I think one could probably come up with a more sophisticated scheme
where throttling is done at the bdi level but is also accounted at the
device level in the IO controller. (I had done something similar in the
past, but Dave Chinner did not like it.)

Anyway, keeping track of the per cgroup rate and throttling accordingly
can definitely help implement an algorithm for per cgroup IO control. We
probably just need to find a reasonable way to account all this IO to the
end device so that we have control over all kinds of IO of a cgroup.

How do you implement proportional control here? From the overall bdi
bandwidth, do you vary the per cgroup bandwidth regularly based on cgroup
weight? Again, the issue here is that it controls only buffered WRITES
and nothing else, and in this case co-ordinating with CFQ will probably
be hard. So I guess proportional IO just for buffered WRITES will have
limited usage.

Thanks
Vivek

>
> shortlog:
>
> 	Wu Fengguang (5):
> 	      writeback: account per-bdi accumulated dirtied pages
> 	      writeback: dirty position control
> 	      writeback: dirty rate control
> 	      writeback: per task dirty rate limit
> 	      writeback: IO-less balance_dirty_pages()
>
> The last 4 patches are one single logical change, but splitted here to
> make it easier to review the different parts of the algorithm.
>
> diffstat:
>
>  include/linux/backing-dev.h      |    8 +
>  include/linux/sched.h            |    7 +
>  include/trace/events/writeback.h |   24 --
>  mm/backing-dev.c                 |    3 +
>  mm/memory_hotplug.c              |    3 -
>  mm/page-writeback.c              |  459 ++++++++++++++++++++++----------------
>  6 files changed, 290 insertions(+), 214 deletions(-)
>
> Thanks,
> Fengguang
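
For readers following the proportional-control question in the message above, the idea being asked about (dividing the overall bdi bandwidth among cgroups according to their weights, and recomputing it regularly) can be sketched roughly as follows. This is only an illustration of the question, not code from the patch series: `struct cg_share`, `split_bdi_bandwidth()`, and the byte-per-second units are hypothetical names and assumptions, and the sketch covers buffered writes only, which is exactly the limitation the message points out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-cgroup state for illustration; not a kernel structure. */
struct cg_share {
	unsigned int weight;     /* relative cgroup weight (assumed > 0) */
	uint64_t     rate_limit; /* computed buffered-write limit, bytes/s */
};

/*
 * Split an estimated bdi write bandwidth among cgroups in proportion to
 * their weights. A real implementation would presumably re-run this
 * periodically as the bandwidth estimate changes; that part is omitted.
 */
static void split_bdi_bandwidth(uint64_t bdi_bw, struct cg_share *cgs, size_t n)
{
	uint64_t total = 0;
	size_t i;

	for (i = 0; i < n; i++)
		total += cgs[i].weight;

	/* At least one cgroup with a non-zero weight is required. */
	assert(total > 0);

	for (i = 0; i < n; i++)
		cgs[i].rate_limit = bdi_bw * cgs[i].weight / total;
}
```

With two cgroups of weight 100 and 300 on a bdi estimated at 100 MiB/s, the sketch assigns 25 MiB/s and 75 MiB/s respectively. Direct IO and reads from the same cgroups are not accounted for here, so the split says nothing about what actually reaches the device.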