From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from psmtp.com (na3sys010amx176.postini.com [74.125.245.176])
	by kanga.kvack.org (Postfix) with SMTP id C868F6B004A
	for ; Wed, 15 Feb 2012 22:53:53 -0500 (EST)
Received: from m4.gw.fujitsu.co.jp (unknown [10.0.50.74])
	by fgwmail5.fujitsu.co.jp (Postfix) with ESMTP id 9E79B3EE0BC
	for ; Thu, 16 Feb 2012 12:53:51 +0900 (JST)
Received: from smail (m4 [127.0.0.1])
	by outgoing.m4.gw.fujitsu.co.jp (Postfix) with ESMTP id 8009645DE54
	for ; Thu, 16 Feb 2012 12:53:51 +0900 (JST)
Received: from s4.gw.fujitsu.co.jp (s4.gw.fujitsu.co.jp [10.0.50.94])
	by m4.gw.fujitsu.co.jp (Postfix) with ESMTP id EBD5B45DD74
	for ; Thu, 16 Feb 2012 12:53:50 +0900 (JST)
Received: from s4.gw.fujitsu.co.jp (localhost.localdomain [127.0.0.1])
	by s4.gw.fujitsu.co.jp (Postfix) with ESMTP id DEC2F1DB8042
	for ; Thu, 16 Feb 2012 12:53:50 +0900 (JST)
Received: from ml13.s.css.fujitsu.com (ml13.s.css.fujitsu.com [10.240.81.133])
	by s4.gw.fujitsu.co.jp (Postfix) with ESMTP id 87A9F1DB803E
	for ; Thu, 16 Feb 2012 12:53:50 +0900 (JST)
Date: Thu, 16 Feb 2012 12:52:21 +0900
From: KAMEZAWA Hiroyuki
Subject: Re: reclaim the LRU lists full of dirty/writeback pages
Message-Id: <20120216125221.81424ebc.kamezawa.hiroyu@jp.fujitsu.com>
In-Reply-To: <20120216030415.GA17597@localhost>
References: <20120208093120.GA18993@localhost>
	<20120210114706.GA4704@localhost>
	<20120211124445.GA10826@localhost>
	<20120214101931.GB5938@suse.de>
	<20120214131812.GA17625@localhost>
	<20120216090037.31d04ec7.kamezawa.hiroyu@jp.fujitsu.com>
	<20120216030415.GA17597@localhost>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
List-ID:
To: Wu Fengguang
Cc: Mel Gorman, Greg Thelen, Jan Kara, "bsingharora@gmail.com",
	Hugh Dickins, Michal Hocko, linux-mm@kvack.org, Ying Han,
	"hannes@cmpxchg.org", Rik van Riel, Minchan Kim

On Thu, 16 Feb 2012 11:04:15 +0800
Wu Fengguang wrote:

> On Thu, Feb 16, 2012 at 09:00:37AM +0900, KAMEZAWA Hiroyuki wrote:
> > On Tue, 14 Feb 2012 21:18:12 +0800
> > Wu Fengguang wrote:
> >
> > >
> > > --- linux.orig/include/linux/backing-dev.h	2012-02-14 19:43:06.000000000 +0800
> > > +++ linux/include/linux/backing-dev.h	2012-02-14 19:49:26.000000000 +0800
> > > @@ -304,6 +304,8 @@ void clear_bdi_congested(struct backing_
> > >  void set_bdi_congested(struct backing_dev_info *bdi, int sync);
> > >  long congestion_wait(int sync, long timeout);
> > >  long wait_iff_congested(struct zone *zone, int sync, long timeout);
> > > +long reclaim_wait(long timeout);
> > > +void reclaim_rotated(void);
> > >
> > >  static inline bool bdi_cap_writeback_dirty(struct backing_dev_info *bdi)
> > >  {
> > > --- linux.orig/mm/backing-dev.c	2012-02-14 19:26:15.000000000 +0800
> > > +++ linux/mm/backing-dev.c	2012-02-14 20:09:45.000000000 +0800
> > > @@ -873,3 +873,38 @@ out:
> > >  	return ret;
> > >  }
> > >  EXPORT_SYMBOL(wait_iff_congested);
> > > +
> > > +static DECLARE_WAIT_QUEUE_HEAD(reclaim_wqh);
> > > +
> > > +/**
> > > + * reclaim_wait - wait for some pages being rotated to the LRU tail
> > > + * @timeout: timeout in jiffies
> > > + *
> > > + * Wait until @timeout, or when some (typically PG_reclaim under writeback)
> > > + * pages rotated to the LRU so that page reclaim can make progress.
> > > + */
> > > +long reclaim_wait(long timeout)
> > > +{
> > > +	long ret;
> > > +	unsigned long start = jiffies;
> > > +	DEFINE_WAIT(wait);
> > > +
> > > +	prepare_to_wait(&reclaim_wqh, &wait, TASK_KILLABLE);
> > > +	ret = io_schedule_timeout(timeout);
> > > +	finish_wait(&reclaim_wqh, &wait);
> > > +
> > > +	trace_writeback_reclaim_wait(jiffies_to_usecs(timeout),
> > > +				     jiffies_to_usecs(jiffies - start));
> > > +
> > > +	return ret;
> > > +}
> > > +EXPORT_SYMBOL(reclaim_wait);
> > > +
> > > +void reclaim_rotated()
> > > +{
> > > +	wait_queue_head_t *wqh = &reclaim_wqh;
> > > +
> > > +	if (waitqueue_active(wqh))
> > > +		wake_up(wqh);
> > > +}
> > > +
> >
> > Thank you.
> >
> > I like this approach. A nitpick is that this may wake up all waiters
> > in the system when a memcg is rotated.
>
> Thank you. It sure helps to start it simple :-)
>
> > How about wait_event() + condition by bitmap (using per memcg unique IDs.) ?
>
> I'm not sure how to manage the bitmap. The idea in my mind is to
>
> - maintain a memcg->pages_rotated counter
>
> - in reclaim_wait(), grab the current ->pages_rotated value before
>   going to wait, compare it to the new value on every wakeup, and
>   return to the user when seeing a different ->pages_rotated value.
>   (this cannot stop waking up multiple tasks in the same memcg...)
>
> Does that sound reasonable?
>

Maybe. But there may be a problem with looking up the memcg from the page
at every rotation. I think it's OK to start with an approach that ignores
per-memcg status. Sorry for the noise.

Thanks,
-Kame
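
For reference, the ->pages_rotated idea quoted above can be sketched in
userspace. This is only an illustration under stated assumptions, not
kernel code and not part of the posted patch: struct group stands in for
a memcg, a pthread condition variable stands in for reclaim_wqh, and the
names group_snapshot(), group_rotated() and group_reclaim_wait() are
hypothetical. Every rotation still wakes every waiter, but a waiter only
returns early when its own group's counter has moved.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* One global "wait queue", as in the patch; names here are hypothetical. */
static pthread_mutex_t reclaim_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t reclaim_cond = PTHREAD_COND_INITIALIZER;

/* Hypothetical stand-in for a memcg: only the rotation counter matters. */
struct group {
	unsigned long pages_rotated;
};

/* Grab the current counter value before going to wait. */
unsigned long group_snapshot(struct group *g)
{
	unsigned long v;

	pthread_mutex_lock(&reclaim_lock);
	v = g->pages_rotated;
	pthread_mutex_unlock(&reclaim_lock);
	return v;
}

/* A page of this group was rotated: bump the counter, wake all waiters. */
void group_rotated(struct group *g)
{
	pthread_mutex_lock(&reclaim_lock);
	g->pages_rotated++;
	pthread_cond_broadcast(&reclaim_cond);
	pthread_mutex_unlock(&reclaim_lock);
}

/*
 * Wait until this group's counter differs from @snapshot or until
 * @timeout_ms has passed. Wakeups caused by other groups are filtered
 * out by re-checking the counter. Returns true if the counter changed.
 */
bool group_reclaim_wait(struct group *g, unsigned long snapshot,
			long timeout_ms)
{
	struct timespec deadline;
	bool changed;
	int rc = 0;

	assert(timeout_ms >= 0);

	/* pthread_cond_timedwait() takes an absolute CLOCK_REALTIME time. */
	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += timeout_ms / 1000;
	deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}

	pthread_mutex_lock(&reclaim_lock);
	while (g->pages_rotated == snapshot && rc != ETIMEDOUT)
		rc = pthread_cond_timedwait(&reclaim_cond, &reclaim_lock,
					    &deadline);
	changed = (g->pages_rotated != snapshot);
	pthread_mutex_unlock(&reclaim_lock);

	return changed;
}
```

A caller would take group_snapshot() first, then call
group_reclaim_wait() with that value. As noted in the quoted text, this
does not stop several tasks in the same group from being woken together,
and it says nothing about the cost of finding the group for each rotated
page, which is the concern raised in the reply.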