From: Wu Fengguang <fengguang.wu@intel.com>
To: "Li, Shaohua" <shaohua.li@intel.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>
Subject: Re: [PATCH]vmscan: handle underflow for get_scan_ratio
Date: Wed, 31 Mar 2010 13:41:09 +0800
Message-ID: <20100331054109.GA21371@localhost>
In-Reply-To: <20100331045348.GA3396@sli10-desk.sh.intel.com>

On Wed, Mar 31, 2010 at 12:53:48PM +0800, Li, Shaohua wrote:
> On Tue, Mar 30, 2010 at 02:08:53PM +0800, KOSAKI Motohiro wrote:
> > Hi
> >
> > > Commit 84b18490d1f1bc7ed5095c929f78bc002eb70f26 introduces a regression.
> > > With it, our tmpfs test always oom. The test has a lot of rotated anon
> > > pages and cause percent[0] zero. Actually the percent[0] is a very small
> > > value, but our calculation round it to zero. The commit makes vmscan
> > > completely skip anon pages and cause oops.
> > > An option is if percent[x] is zero in get_scan_ratio(), forces it
> > > to 1. See below patch.
> > > But the offending commit still changes behavior. Without the commit, we scan
> > > all pages if priority is zero, below patch doesn't fix this. Don't know if
> > > It's required to fix this too.
> >
> > Can you please post your /proc/meminfo and reproduce program? I'll digg it.
> >
> > Very unfortunately, this patch isn't acceptable. In past time, vmscan
> > had similar logic, but 1% swap-out made lots bug reports.
> if 1% is still big, how about below patch?
>
> Commit 84b18490d1f1bc7ed5095c929f78bc002eb70f26 introduces a regression.
> With it, our tmpfs test always oom. The test has a lot of rotated anon
> pages and cause percent[0] zero. Actually the percent[0] is a very small
> value, but our calculation round it to zero. The commit makes vmscan
> completely skip anon pages and cause oops.
> To avoid underflow, we don't use percentage, instead we directly calculate
> how many pages should be scaned.

The changelog can be improved. For example, these items could be
described in separate paragraphs:

- the behavior change introduced by 84b18490d1f (which claims to be cleanup)
- the tmpfs test case
- the root cause
- the solution
- test result

> Signed-off-by: Shaohua Li <shaohua.li@intel.com>
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 79c8098..80a7ed5 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1519,27 +1519,50 @@ static unsigned long shrink_list(enum lru_list lru, unsigned long nr_to_scan,
>  }
>
>  /*
> + * Smallish @nr_to_scan's are deposited in @nr_saved_scan,
> + * until we collected @swap_cluster_max pages to scan.
> + */
> +static unsigned long nr_scan_try_batch(unsigned long nr_to_scan,
> +				       unsigned long *nr_saved_scan)
> +{
> +	unsigned long nr;
> +
> +	*nr_saved_scan += nr_to_scan;
> +	nr = *nr_saved_scan;
> +
> +	if (nr >= SWAP_CLUSTER_MAX)
> +		*nr_saved_scan = 0;
> +	else
> +		nr = 0;
> +
> +	return nr;
> +}
> +
> +/*
>  * Determine how aggressively the anon and file LRU lists should be
>  * scanned.  The relative value of each set of LRU lists is determined
>  * by looking at the fraction of the pages scanned we did rotate back
>  * onto the active list instead of evict.
>  *
> - * percent[0] specifies how much pressure to put on ram/swap backed
> - * memory, while percent[1] determines pressure on the file LRUs.
> + * nr[x] specifies how many pages should be scaned

typo: scanned

>  */
> -static void get_scan_ratio(struct zone *zone, struct scan_control *sc,
> -					unsigned long *percent)
> +static void get_scan_count(struct zone *zone, struct scan_control *sc,
> +					unsigned long *nr, int priority)
>  {
>  	unsigned long anon, file, free;
>  	unsigned long anon_prio, file_prio;
>  	unsigned long ap, fp;
>  	struct zone_reclaim_stat *reclaim_stat = get_reclaim_stat(zone, sc);
> +	unsigned long fraction[2], denominator[2];

The denominator is shared, so one scalar would be sufficient.
Also, ap and fp can be removed in favor of using fraction[] directly.

And it's better to retain this comment:
	/* anon @ 0; file @ 1 */

> +	enum lru_list l;
>
>  	/* If we have no swap space, do not bother scanning anon pages. */
>  	if (!sc->may_swap || (nr_swap_pages <= 0)) {
> -		percent[0] = 0;
> -		percent[1] = 100;
> -		return;
> +		fraction[0] = 0;
> +		denominator[0] = 1;
> +		fraction[1] = 1;
> +		denominator[1] = 1;
> +		goto out;
>  	}
>
>  	anon  = zone_nr_lru_pages(zone, sc, LRU_ACTIVE_ANON) +
> @@ -1552,9 +1575,11 @@ static void get_scan_ratio(struct zone *zone, struct scan_control *sc,
>  		/* If we have very few page cache pages,
>  		   force-scan anon pages. */
>  		if (unlikely(file + free <= high_wmark_pages(zone))) {
> -			percent[0] = 100;
> -			percent[1] = 0;
> -			return;
> +			fraction[0] = 1;
> +			denominator[0] = 1;
> +			fraction[1] = 0;
> +			denominator[1] = 1;
> +			goto out;
>  		}
>  	}
>
> @@ -1601,29 +1626,29 @@ static void get_scan_ratio(struct zone *zone, struct scan_control *sc,
>  	fp = (file_prio + 1) * (reclaim_stat->recent_scanned[1] + 1);
>  	fp /= reclaim_stat->recent_rotated[1] + 1;
>
> -	/* Normalize to percentages */
> -	percent[0] = 100 * ap / (ap + fp + 1);
> -	percent[1] = 100 - percent[0];
> -}
> -
> -/*
> - * Smallish @nr_to_scan's are deposited in @nr_saved_scan,
> - * until we collected @swap_cluster_max pages to scan.
> - */
> -static unsigned long nr_scan_try_batch(unsigned long nr_to_scan,
> -				       unsigned long *nr_saved_scan)
> -{
> -	unsigned long nr;
> +	fraction[0] = ap;
> +	denominator[0] = ap + fp + 1;
> +	fraction[1] = fp;
> +	denominator[1] = ap + fp + 1;
>
> -	*nr_saved_scan += nr_to_scan;
> -	nr = *nr_saved_scan;
> +out:
> +	for_each_evictable_lru(l) {
> +		int file = is_file_lru(l);
> +		unsigned long scan;
>
> -	if (nr >= SWAP_CLUSTER_MAX)
> -		*nr_saved_scan = 0;
> -	else
> -		nr = 0;
> +		if (fraction[file] == 0) {
> +			nr[l] = 0;
> +			continue;
> +		}
>
> -	return nr;
> +		scan = zone_nr_lru_pages(zone, sc, l);
> +		if (priority) {
> +			scan >>= priority;
> +			scan = (scan * fraction[file] / denominator[file]);

		scan = (scan * fraction[file]) / denominator[file];

Thanks,
Fengguang

> +		}
> +		nr[l] = nr_scan_try_batch(scan,
> +					  &reclaim_stat->nr_saved_scan[l]);
> +	}
>  }
>
>  /*
> @@ -1634,31 +1659,11 @@ static void shrink_zone(int priority, struct zone *zone,
>  {
>  	unsigned long nr[NR_LRU_LISTS];
>  	unsigned long nr_to_scan;
> -	unsigned long percent[2];	/* anon @ 0; file @ 1 */
>  	enum lru_list l;
>  	unsigned long nr_reclaimed = sc->nr_reclaimed;
>  	unsigned long nr_to_reclaim = sc->nr_to_reclaim;
> -	struct zone_reclaim_stat *reclaim_stat = get_reclaim_stat(zone, sc);
> -
> -	get_scan_ratio(zone, sc, percent);
>
> -	for_each_evictable_lru(l) {
> -		int file = is_file_lru(l);
> -		unsigned long scan;
> -
> -		if (percent[file] == 0) {
> -			nr[l] = 0;
> -			continue;
> -		}
> -
> -		scan = zone_nr_lru_pages(zone, sc, l);
> -		if (priority) {
> -			scan >>= priority;
> -			scan = (scan * percent[file]) / 100;
> -		}
> -		nr[l] = nr_scan_try_batch(scan,
> -					  &reclaim_stat->nr_saved_scan[l]);
> -	}
> +	get_scan_count(zone, sc, nr, priority);
>
>  	while (nr[LRU_INACTIVE_ANON] || nr[LRU_ACTIVE_FILE] ||
>  	       nr[LRU_INACTIVE_FILE]) {