From: Rik van Riel <riel@redhat.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Mel Gorman <mel@csn.ul.ie>,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org, Dave Chinner <david@fromorbit.com>,
	Chris Mason <chris.mason@oracle.com>,
	Nick Piggin <npiggin@suse.de>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Christoph Hellwig <hch@infradead.org>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Subject: Re: [PATCH 05/12] vmscan: kill prev_priority completely
Date: Wed, 16 Jun 2010 20:34:09 -0400
Message-ID: <4C196D81.8090700@redhat.com>
In-Reply-To: <20100616171847.71703d1a.akpm@linux-foundation.org>

On 06/16/2010 08:18 PM, Andrew Morton wrote:
> On Wed, 16 Jun 2010 19:45:29 -0400
> Rik van Riel <riel@redhat.com> wrote:
>
>> On 06/16/2010 07:37 PM, Andrew Morton wrote:
>>
>>> This would have been badder in earlier days when we were using the
>>> scanning priority to decide when to start unmapping pte-mapped pages -
>>> page reclaim would have been recirculating large blobs of mapped pages
>>> around the LRU until the priority had built to the level where we
>>> started to unmap them.
>>>
>>> However that priority-based decision got removed and right now I don't
>>> recall what it got replaced with.  Aren't we now unmapping pages way
>>> too early and suffering an increased major&minor fault rate?  Worried.
>>
>> We keep a different set of statistics to decide whether to
>> reclaim only page cache pages, or both page cache and
>> anonymous pages. The function get_scan_ratio parses those
>> statistics.
>
> I wasn't talking about anon-vs-file.  I was referring to mapped-file
> versus not-mapped file.  If the code sees a mapped page come off the
> tail of the LRU it'll just unmap and reclaim the thing.  This policy
> caused awful amounts of paging activity when someone started doing lots
> of read() activity, which is why the VM was changed to value mapped
> pagecache higher than unmapped pagecache.  Did this biasing get
> retained and if so, how?

It changed a little, but we still have it:

1) we do not deactivate active file pages if the active file
    list is smaller than the inactive file list - this protects
    the working set from streaming IO

2) we keep mapped referenced executable pages on the active file
    list if they got accessed while on the active list, while
    other file pages get deactivated unconditionally
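
The two rules above can be sketched roughly as follows. This is an
illustration only, not the code in mm/vmscan.c: the struct and function
names are hypothetical, and the behaviour when both lists are the same
size is an assumption, since the text does not cover that case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-zone file LRU sizes, in pages (not a kernel struct). */
struct file_lru_sizes {
	unsigned long active;
	unsigned long inactive;
};

/* Hypothetical per-page state (not struct page). */
struct file_page_state {
	bool mapped;      /* page is mapped into some address space */
	bool executable;  /* the mapping is executable */
	bool referenced;  /* page was accessed while on the active list */
};

/*
 * Rule 1: leave the active file list alone while it is smaller than
 * the inactive file list, so streaming IO cannot push out the working
 * set. Equal sizes are treated as "may deactivate" (an assumption).
 */
static bool may_deactivate_file_pages(const struct file_lru_sizes *lru)
{
	assert(lru != NULL);
	return lru->active >= lru->inactive;
}

/*
 * Rule 2: when deactivation does run, mapped executable pages that
 * were referenced stay on the active list; every other file page is
 * moved to the inactive list.
 */
static bool keep_on_active_list(const struct file_page_state *page)
{
	assert(page != NULL);
	return page->mapped && page->executable && page->referenced;
}
```

With an active list of 100 pages and an inactive list of 1000 pages,
may_deactivate_file_pages() returns false, so a large read() stream only
churns the inactive list.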

> Does thrash-avoidance actually still work?

I suspect it does, but I have not actually tested that code
in years :)

>> I do not believe prev_priority will be very useful here, since
>> we'd like to start out with small scans whenever possible.
>
> Why?

For one, memory sizes today are a lot larger than they were
when 2.6.0 came out.

Secondly, we now know more exactly what is on each LRU list.
That should greatly reduce unnecessary turnover of the list.

For example, if we know there is no swap space available, we
will not bother scanning the anon LRU lists.

If we know there is not enough file cache left to get us up
to the zone high water mark, we will not bother scanning the
few remaining file pages.

Because of those simple checks (in get_scan_ratio), I do
not expect that we will have to scan through all of memory
as frequently as we had to do in 2.6.0.
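
A minimal sketch of the two checks described above, under stated
assumptions: the struct, field, and function names are hypothetical,
"not enough file cache" is read as "file pages plus free pages do not
exceed the high watermark", and the no-swap check is assumed to take
precedence when both conditions hold. The real kernel code differs in
detail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical snapshot of the counters the checks look at, in pages. */
struct zone_snapshot {
	long free_swap_pages;      /* swap space still available */
	unsigned long file_pages;  /* active + inactive file LRU pages */
	unsigned long free_pages;  /* free pages in the zone */
	unsigned long high_wmark;  /* zone high watermark */
};

/* Which LRU lists this reclaim pass should bother scanning. */
struct scan_plan {
	bool scan_anon;
	bool scan_file;
};

static struct scan_plan plan_lru_scan(const struct zone_snapshot *z)
{
	struct scan_plan plan = { .scan_anon = true, .scan_file = true };

	assert(z != NULL);

	/* No swap space: anonymous pages cannot be evicted, skip them. */
	if (z->free_swap_pages <= 0) {
		plan.scan_anon = false;
		return plan;
	}

	/*
	 * Too little file cache to reach the high watermark even if all
	 * of it were reclaimed: skip the few remaining file pages.
	 */
	if (z->file_pages + z->free_pages <= z->high_wmark)
		plan.scan_file = false;

	return plan;
}
```

Either check removes a whole class of LRU lists from the pass, which is
why a scan can start small without walking pages that cannot be
reclaimed anyway.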

Furthermore, we unconditionally deactivate most active pages
and have a working used-once scheme for pages on the anon
lists.  This should also contribute to a reduction in the
number of pages that get scanned.

>> In that case, the prev_priority logic may have introduced the
>> kind of behavioural bug you describe above...
>>
>>> And one has to wonder: if we're making these incorrect decisions based
>>> upon a bogus view of the current scanning difficulty, why are these
>>> various priority-based thresholding heuristics even in there?  Are they
>>> doing anything useful?
>>
>> The prev_priority code was useful when we had filesystem and
>> swap backed pages mixed on the same LRU list.
>
> No, stop saying swap! ;)
>
> It's all to do with mapped pagecache versus unmapped pagecache.  "ytf
> does my browser get paged out all the time".

We have other measures in place now to protect the working set
on the file LRU lists (see above).  We are able to have those
measures in the kernel because we no longer have mixed LRU
lists.

-- 
All rights reversed


