From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751476Ab3CSK6e (ORCPT ); Tue, 19 Mar 2013 06:58:34 -0400
Received: from cantor2.suse.de ([195.135.220.15]:60514 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751071Ab3CSK6c (ORCPT ); Tue, 19 Mar 2013 06:58:32 -0400
Date: Tue, 19 Mar 2013 10:58:28 +0000
From: Mel Gorman
To: Wanpeng Li
Cc: Linux-MM, Jiri Slaby, Valdis Kletnieks, Rik van Riel, Zlatko Calusic, Johannes Weiner, dormando, Satoru Moriya, Michal Hocko, LKML
Subject: Re: [PATCH 07/10] mm: vmscan: Block kswapd if it is encountering pages under writeback
Message-ID: <20130319105828.GI2055@suse.de>
References: <1363525456-10448-1-git-send-email-mgorman@suse.de> <1363525456-10448-8-git-send-email-mgorman@suse.de> <20130318115827.GB7245@hacker.(null)>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-15
Content-Disposition: inline
In-Reply-To: <20130318115827.GB7245@hacker.(null)>
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Mar 18, 2013 at 07:58:27PM +0800, Wanpeng Li wrote:
> On Sun, Mar 17, 2013 at 01:04:13PM +0000, Mel Gorman wrote:
> >Historically, kswapd used to congestion_wait() at higher priorities if it
> >was not making forward progress. This made no sense as the failure to make
> >progress could be completely independent of IO. It was later replaced by
> >wait_iff_congested() and removed entirely by commit 258401a6 (mm: don't
> >wait on congested zones in balance_pgdat()) as it was duplicating logic
> >in shrink_inactive_list().
> >
> >This is problematic. If kswapd encounters many pages under writeback and
> >it continues to scan until it reaches the high watermark then it will
> >quickly skip over the pages under writeback and reclaim clean young
> >pages or push applications out to swap.
> >
> >The use of wait_iff_congested() is not suited to kswapd as it will only
> >stall if the underlying BDI is really congested or a direct reclaimer was
> >unable to write to the underlying BDI. kswapd bypasses the BDI congestion
> >as it sets PF_SWAPWRITE but even if this was taken into account then it
> >would cause direct reclaimers to stall on writeback which is not desirable.
> >
> >This patch sets a ZONE_WRITEBACK flag if direct reclaim or kswapd is
> >encountering too many pages under writeback. If this flag is set and
> >kswapd encounters a PageReclaim page under writeback then it'll assume
> >that the LRU lists are being recycled too quickly before IO can complete
> >and block waiting for some IO to complete.
> >
> >Signed-off-by: Mel Gorman
> >---
> > include/linux/mmzone.h |  8 ++++++++
> > mm/vmscan.c            | 29 ++++++++++++++++++++++++-----
> > 2 files changed, 32 insertions(+), 5 deletions(-)
> >
> >diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> >index edd6b98..c758fb7 100644
> >--- a/include/linux/mmzone.h
> >+++ b/include/linux/mmzone.h
> >@@ -498,6 +498,9 @@ typedef enum {
> > 	ZONE_DIRTY,	/* reclaim scanning has recently found
> > 			 * many dirty file pages
> > 			 */
> >+	ZONE_WRITEBACK,	/* reclaim scanning has recently found
> >+			 * many pages under writeback
> >+			 */
> > } zone_flags_t;
> >
> > static inline void zone_set_flag(struct zone *zone, zone_flags_t flag)
> >@@ -525,6 +528,11 @@ static inline int zone_is_reclaim_dirty(const struct zone *zone)
> > 	return test_bit(ZONE_DIRTY, &zone->flags);
> > }
> >
> >+static inline int zone_is_reclaim_writeback(const struct zone *zone)
> >+{
> >+	return test_bit(ZONE_WRITEBACK, &zone->flags);
> >+}
> >+
> > static inline int zone_is_reclaim_locked(const struct zone *zone)
> > {
> > 	return test_bit(ZONE_RECLAIM_LOCKED, &zone->flags);
> >diff --git a/mm/vmscan.c b/mm/vmscan.c
> >index 493728b..7d5a932 100644
> >--- a/mm/vmscan.c
> >+++ b/mm/vmscan.c
> >@@ -725,6 +725,19 @@ static unsigned long shrink_page_list(struct list_head *page_list,
> >
> > 		if (PageWriteback(page)) {
> > 			/*
> >+			 * If reclaim is encountering an excessive number of
> >+			 * pages under writeback and this page is both under
>
> Is the comment should changed to "encountered an excessive number of
> pages under writeback or this page is both under writeback and PageReclaim"?
> See below:
>

I intended to check for PageReclaim as well but it got lost in a merge
error. Fixed now.

-- 
Mel Gorman
SUSE Labs
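
For readers following the thread, the behaviour described in the quoted
changelog can be sketched as a small userspace C model. This is not the
code from the patch: every name below (model_zone, model_page,
model_note_batch, model_writeback_action) is hypothetical, and the
"half of the scanned batch" threshold is an assumption, since the
changelog only says "too many pages under writeback".

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel state; not the kernel's structures. */
struct model_page {
	bool writeback;		/* models PageWriteback(page) */
	bool reclaim;		/* models PageReclaim(page) */
};

struct model_zone {
	bool writeback_flag;	/* models the ZONE_WRITEBACK zone flag */
};

enum wb_action {
	WB_NOT_APPLICABLE,	/* page is not under writeback */
	WB_SKIP,		/* keep scanning past the page */
	WB_WAIT,		/* block until some IO completes */
};

/*
 * Flag the zone when a scanned batch held "too many" writeback pages.
 * ASSUMPTION: "too many" is modelled as at least half of the batch;
 * the changelog does not state the real threshold.
 */
static void model_note_batch(struct model_zone *zone,
			     size_t nr_scanned, size_t nr_writeback)
{
	if (nr_scanned > 0 && nr_writeback * 2 >= nr_scanned)
		zone->writeback_flag = true;
}

/*
 * Decide what to do with one page, following the changelog: only kswapd
 * blocks, and only when the zone is flagged and the page under writeback
 * is also marked PageReclaim.
 */
static enum wb_action model_writeback_action(const struct model_zone *zone,
					     const struct model_page *page,
					     bool is_kswapd)
{
	if (!page->writeback)
		return WB_NOT_APPLICABLE;
	if (is_kswapd && zone->writeback_flag && page->reclaim)
		return WB_WAIT;
	return WB_SKIP;
}
```

In this model a direct reclaimer never receives WB_WAIT, which mirrors
the changelog's point that stalling direct reclaimers on writeback is
not desirable. Both the zone flag and the PageReclaim check are required
before kswapd blocks, which is the condition the reply above says was
lost in the merge error.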