linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] mm/vmscan: Prioritize anonymous executable pages like we do file-backed
@ 2020-03-04 20:32 Adam Jackson
  2020-03-04 20:38 ` Matthew Wilcox
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Adam Jackson @ 2020-03-04 20:32 UTC (permalink / raw)
  To: linux-mm

The page reclamation scanner tries to keep executable pages resident,
since taking a hard page fault to satisfy an icache miss is really not
great for interactivity.  Anonymous executable pages tend to contain
code that has been just-in-time compiled for performance reasons. By
requiring that executable pages be file-backed, we're putting possibly
the most performance-sensitive code at higher risk of eviction, which
seems backwards.

On an amd64 machine running Fedora 31, the firefox I happen to have
running requires about 89M of file-backed text and 12M of anonymous text
for 30 open tabs. The next largest process in terms of anonymous text is
gnome-shell, with 1M anonymous and 57M file-backed. No other process had
significant anonymous text, most had none. Penalizing those 13M
specifically when under memory pressure seems like an easy hazard to
avoid.

Signed-off-by: Adam Jackson <ajax@redhat.com>
---
 mm/vmscan.c | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index ee4eecc7e1c2..9bfbc30d61d8 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2095,15 +2095,12 @@ static void shrink_active_list(unsigned long nr_to_scan,
 				    &vm_flags)) {
 			nr_rotated += hpage_nr_pages(page);
 			/*
-			 * Identify referenced, file-backed active pages and
-			 * give them one more trip around the active list. So
+			 * Identify referenced, executable active pages and
+			 * give them one more trip around the active list, so
 			 * that executable code get better chances to stay in
-			 * memory under moderate memory pressure.  Anon pages
-			 * are not likely to be evicted by use-once streaming
-			 * IO, plus JVM can create lots of anon VM_EXEC pages,
-			 * so we ignore them here.
+			 * memory under moderate memory pressure.
 			 */
-			if ((vm_flags & VM_EXEC) && page_is_file_cache(page)) {
+			if ((vm_flags & VM_EXEC)) {
 				list_add(&page->lru, &l_active);
 				continue;
 			}
-- 
2.23.0



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] mm/vmscan: Prioritize anonymous executable pages like we do file-backed
  2020-03-04 20:32 [PATCH] mm/vmscan: Prioritize anonymous executable pages like we do file-backed Adam Jackson
@ 2020-03-04 20:38 ` Matthew Wilcox
  2020-03-05 12:47 ` Vlastimil Babka
  2020-03-05 15:17 ` Michal Hocko
  2 siblings, 0 replies; 8+ messages in thread
From: Matthew Wilcox @ 2020-03-04 20:38 UTC (permalink / raw)
  To: Adam Jackson; +Cc: linux-mm

On Wed, Mar 04, 2020 at 03:32:35PM -0500, Adam Jackson wrote:
> -			if ((vm_flags & VM_EXEC) && page_is_file_cache(page)) {
> +			if ((vm_flags & VM_EXEC)) {

You can drop one set of parens here


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] mm/vmscan: Prioritize anonymous executable pages like we do file-backed
  2020-03-04 20:32 [PATCH] mm/vmscan: Prioritize anonymous executable pages like we do file-backed Adam Jackson
  2020-03-04 20:38 ` Matthew Wilcox
@ 2020-03-05 12:47 ` Vlastimil Babka
  2020-03-05 20:41   ` Shakeel Butt
  2020-03-05 15:17 ` Michal Hocko
  2 siblings, 1 reply; 8+ messages in thread
From: Vlastimil Babka @ 2020-03-05 12:47 UTC (permalink / raw)
  To: Adam Jackson, linux-mm
  Cc: Johannes Weiner, Shakeel Butt, Roman Gushchin, Joonsoo Kim,
	Michal Hocko, Mel Gorman

+ CC folks who focus on reclaim

On 3/4/20 9:32 PM, Adam Jackson wrote:
> The page reclamation scanner tries to keep executable pages resident,
> since taking a hard page fault to satisfy an icache miss is really not
> great for interactivity.  Anonymous executable pages tend to contain
> code that has been just-in-time compiled for performance reasons. By
> requiring that executable pages be file-backed, we're putting possibly
> the most performance-sensitive code at higher risk of eviction, which
> seems backwards.
> 
> On an amd64 machine running Fedora 31, the firefox I happen to have
> running requires about 89M of file-backed text and 12M of anonymous text
> for 30 open tabs. The next largest process in terms of anonymous text is
> gnome-shell, with 1M anonymous and 57M file-backed. No other process had
> significant anonymous text, most had none. Penalizing those 13M
> specifically when under memory pressure seems like an easy hazard to
> avoid.
> 
> Signed-off-by: Adam Jackson <ajax@redhat.com>
> ---
>  mm/vmscan.c | 11 ++++-------
>  1 file changed, 4 insertions(+), 7 deletions(-)
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index ee4eecc7e1c2..9bfbc30d61d8 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -2095,15 +2095,12 @@ static void shrink_active_list(unsigned long nr_to_scan,
>  				    &vm_flags)) {
>  			nr_rotated += hpage_nr_pages(page);
>  			/*
> -			 * Identify referenced, file-backed active pages and
> -			 * give them one more trip around the active list. So
> +			 * Identify referenced, executable active pages and
> +			 * give them one more trip around the active list, so
>  			 * that executable code get better chances to stay in
> -			 * memory under moderate memory pressure.  Anon pages
> -			 * are not likely to be evicted by use-once streaming
> -			 * IO, plus JVM can create lots of anon VM_EXEC pages,
> -			 * so we ignore them here.
> +			 * memory under moderate memory pressure.
>  			 */
> -			if ((vm_flags & VM_EXEC) && page_is_file_cache(page)) {
> +			if ((vm_flags & VM_EXEC)) {
>  				list_add(&page->lru, &l_active);
>  				continue;
>  			}
> 



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] mm/vmscan: Prioritize anonymous executable pages like we do file-backed
  2020-03-04 20:32 [PATCH] mm/vmscan: Prioritize anonymous executable pages like we do file-backed Adam Jackson
  2020-03-04 20:38 ` Matthew Wilcox
  2020-03-05 12:47 ` Vlastimil Babka
@ 2020-03-05 15:17 ` Michal Hocko
  2020-03-05 18:05   ` Adam Jackson
  2 siblings, 1 reply; 8+ messages in thread
From: Michal Hocko @ 2020-03-05 15:17 UTC (permalink / raw)
  To: Adam Jackson; +Cc: linux-mm

On Wed 04-03-20 15:32:35, Adam Jackson wrote:
> The page reclamation scanner tries to keep executable pages resident,
> since taking a hard page fault to satisfy an icache miss is really not
> great for interactivity.  Anonymous executable pages tend to contain
> code that has been just-in-time compiled for performance reasons. By
> requiring that executable pages be file-backed, we're putting possibly
> the most performance-sensitive code at higher risk of eviction, which
> seems backwards.
> 
> On an amd64 machine running Fedora 31, the firefox I happen to have
> running requires about 89M of file-backed text and 12M of anonymous text
> for 30 open tabs. The next largest process in terms of anonymous text is
> gnome-shell, with 1M anonymous and 57M file-backed. No other process had
> significant anonymous text, most had none. Penalizing those 13M
> specifically when under memory pressure seems like an easy hazard to
> avoid.

Are you seeing an actual improvement from this change? IIRC the primary
motivation to make this heuristic page cache oriented is that it was
quite easy to evict file backed memory by streaming IO. This shouldn't
really be a major problem for the anonymous memory in most cases. A
heavy swapin/out workload is likely to suffer from not having data
available more than having the code evicted. But I might be wrong here
and getting some numbers would be really interesting.

> Signed-off-by: Adam Jackson <ajax@redhat.com>
> ---
>  mm/vmscan.c | 11 ++++-------
>  1 file changed, 4 insertions(+), 7 deletions(-)
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index ee4eecc7e1c2..9bfbc30d61d8 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -2095,15 +2095,12 @@ static void shrink_active_list(unsigned long nr_to_scan,
>  				    &vm_flags)) {
>  			nr_rotated += hpage_nr_pages(page);
>  			/*
> -			 * Identify referenced, file-backed active pages and
> -			 * give them one more trip around the active list. So
> +			 * Identify referenced, executable active pages and
> +			 * give them one more trip around the active list, so
>  			 * that executable code get better chances to stay in
> -			 * memory under moderate memory pressure.  Anon pages
> -			 * are not likely to be evicted by use-once streaming
> -			 * IO, plus JVM can create lots of anon VM_EXEC pages,
> -			 * so we ignore them here.
> +			 * memory under moderate memory pressure.
>  			 */
> -			if ((vm_flags & VM_EXEC) && page_is_file_cache(page)) {
> +			if ((vm_flags & VM_EXEC)) {
>  				list_add(&page->lru, &l_active);
>  				continue;
>  			}
> -- 
> 2.23.0
> 

-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] mm/vmscan: Prioritize anonymous executable pages like we do file-backed
  2020-03-05 15:17 ` Michal Hocko
@ 2020-03-05 18:05   ` Adam Jackson
  0 siblings, 0 replies; 8+ messages in thread
From: Adam Jackson @ 2020-03-05 18:05 UTC (permalink / raw)
  To: Michal Hocko; +Cc: linux-mm

On Thu, 2020-03-05 at 16:17 +0100, Michal Hocko wrote:
> On Wed 04-03-20 15:32:35, Adam Jackson wrote:
> > The page reclamation scanner tries to keep executable pages resident,
> > since taking a hard page fault to satisfy an icache miss is really not
> > great for interactivity.  Anonymous executable pages tend to contain
> > code that has been just-in-time compiled for performance reasons. By
> > requiring that executable pages be file-backed, we're putting possibly
> > the most performance-sensitive code at higher risk of eviction, which
> > seems backwards.
> > 
> > On an amd64 machine running Fedora 31, the firefox I happen to have
> > running requires about 89M of file-backed text and 12M of anonymous text
> > for 30 open tabs. The next largest process in terms of anonymous text is
> > gnome-shell, with 1M anonymous and 57M file-backed. No other process had
> > significant anonymous text, most had none. Penalizing those 13M
> > specifically when under memory pressure seems like an easy hazard to
> > avoid.
> 
> Are you seeing an actual improvement from this change? IIRC the primary
> motivation to make this heuristic page cache oriented is that it was
> quite easy to evict file backed memory by streaming IO. This shouldn't
> really be a major problem for the anonymous memory in most cases. A
> heavy swapin/out workload is likely to suffer from not having data
> available more than having the code evicted. But I might be wrong here
> and getting some numbers would be really interesting.

They would be! I confess I don't have any, I'll see if I can gather
some for you. The problem case for this patch is maybe as much about
streaming I/O as it's about just giant wads of dirty data. Given enough
pressure, eventually this loop will prefer to evict jitted code before
precompiled. I find that to be a difficult preference to justify. If
anything maybe you should evict the file-backed code first because you
can discard it instead of writing it out (assuming your swap and
executables are on comparably fast disk, etc).

That's my theory anyway. It's entirely possible I don't understand the
larger environment for this code, and like I say I don't have hard data
yet. For all I know big data-center-y java jobs might actually want the
existing behavior. But it seemed curious enough to warrant at least
sending the patch for feedback.

- ajax



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] mm/vmscan: Prioritize anonymous executable pages like we do file-backed
  2020-03-05 12:47 ` Vlastimil Babka
@ 2020-03-05 20:41   ` Shakeel Butt
  2020-03-05 22:58     ` Adam Jackson
  0 siblings, 1 reply; 8+ messages in thread
From: Shakeel Butt @ 2020-03-05 20:41 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Adam Jackson, Linux MM, Johannes Weiner, Roman Gushchin,
	Joonsoo Kim, Michal Hocko, Mel Gorman

On Thu, Mar 5, 2020 at 4:47 AM Vlastimil Babka <vbabka@suse.cz> wrote:
>
> + CC folks who focus on reclaim
>
> On 3/4/20 9:32 PM, Adam Jackson wrote:
> > The page reclamation scanner tries to keep executable pages resident,
> > since taking a hard page fault to satisfy an icache miss is really not
> > great for interactivity.  Anonymous executable pages tend to contain
> > code that has been just-in-time compiled for performance reasons. By
> > requiring that executable pages be file-backed, we're putting possibly
> > the most performance-sensitive code at higher risk of eviction, which
> > seems backwards.
> >
> > On an amd64 machine running Fedora 31, the firefox I happen to have
> > running requires about 89M of file-backed text and 12M of anonymous text
> > for 30 open tabs. The next largest process in terms of anonymous text is
> > gnome-shell, with 1M anonymous and 57M file-backed. No other process had
> > significant anonymous text, most had none. Penalizing those 13M
> > specifically when under memory pressure seems like an easy hazard to
> > avoid.

Have you actually seen this issue (i.e. JIT code reclaimed and
thrashing) happening for real workloads?

> >
> > Signed-off-by: Adam Jackson <ajax@redhat.com>
> > ---
> >  mm/vmscan.c | 11 ++++-------
> >  1 file changed, 4 insertions(+), 7 deletions(-)
> >
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index ee4eecc7e1c2..9bfbc30d61d8 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -2095,15 +2095,12 @@ static void shrink_active_list(unsigned long nr_to_scan,
> >                                   &vm_flags)) {
> >                       nr_rotated += hpage_nr_pages(page);
> >                       /*
> > -                      * Identify referenced, file-backed active pages and
> > -                      * give them one more trip around the active list. So
> > +                      * Identify referenced, executable active pages and
> > +                      * give them one more trip around the active list, so
> >                        * that executable code get better chances to stay in
> > -                      * memory under moderate memory pressure.  Anon pages
> > -                      * are not likely to be evicted by use-once streaming
> > -                      * IO, plus JVM can create lots of anon VM_EXEC pages,
> > -                      * so we ignore them here.
> > +                      * memory under moderate memory pressure.
> >                        */
> > -                     if ((vm_flags & VM_EXEC) && page_is_file_cache(page)) {

Originally this heuristic was added to protect executable file pages
from the streaming workloads. Now we have workingset detection for
file pages and there is ongoing work for adding that support for anon
pages, I am wondering if this specific heuristic is still helpful.

> > +                     if ((vm_flags & VM_EXEC)) {
> >                               list_add(&page->lru, &l_active);
> >                               continue;
> >                       }
> >
>


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] mm/vmscan: Prioritize anonymous executable pages like we do file-backed
  2020-03-05 20:41   ` Shakeel Butt
@ 2020-03-05 22:58     ` Adam Jackson
  2020-03-06  9:22       ` Mel Gorman
  0 siblings, 1 reply; 8+ messages in thread
From: Adam Jackson @ 2020-03-05 22:58 UTC (permalink / raw)
  To: Shakeel Butt, Vlastimil Babka
  Cc: Linux MM, Johannes Weiner, Roman Gushchin, Joonsoo Kim,
	Michal Hocko, Mel Gorman

On Thu, 2020-03-05 at 12:41 -0800, Shakeel Butt wrote:

> Have you actually seen this issue (i.e. JIT code reclaimed and
> thrashing) happening for real workloads?

The mention of gnome-shell wasn't entirely an accident, it's largely
written in javascript which is jit-compiled. Under sufficiently dire
disk I/O conditions merely moving the cursor would lag. That ought to
be impossible, because nothing in the cursor update path ought to be
blocking on the disk and we believe the CPU scheduler to be competent.
Plenty of bugs are contributing here, but among them yes the (jitted
bits of the) cursor update code can get paged out while megabytes of
file-backed text stay resident. It's certainly drowned out by the
contribution of the plenty of other bugs involved, but still.

> Originally this heuristic was added to protect executable file pages
> from the streaming workloads. Now we have workingset detection for
> file pages and there is ongoing work for adding that support for anon
> pages, I am wondering if this specific heuristic is still helpful.

Enh. I would tend to think that code is way more precious than data in
terms of staying resident. If the working set detection works well
enough to come to that conclusion on its own without explictly knowing
about executable pages, that'd be awesome and I'd be entirely fine with
removing even more of this heuristic.

- ajax



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] mm/vmscan: Prioritize anonymous executable pages like we do file-backed
  2020-03-05 22:58     ` Adam Jackson
@ 2020-03-06  9:22       ` Mel Gorman
  0 siblings, 0 replies; 8+ messages in thread
From: Mel Gorman @ 2020-03-06  9:22 UTC (permalink / raw)
  To: Adam Jackson
  Cc: Shakeel Butt, Vlastimil Babka, Linux MM, Johannes Weiner,
	Roman Gushchin, Joonsoo Kim, Michal Hocko

On Thu, Mar 05, 2020 at 05:58:59PM -0500, Adam Jackson wrote:
> > Originally this heuristic was added to protect executable file pages
> > from the streaming workloads. Now we have workingset detection for
> > file pages and there is ongoing work for adding that support for anon
> > pages, I am wondering if this specific heuristic is still helpful.
> 
> Enh. I would tend to think that code is way more precious than data in
> terms of staying resident. If the working set detection works well
> enough to come to that conclusion on its own without explictly knowing
> about executable pages, that'd be awesome and I'd be entirely fine with
> removing even more of this heuristic.
> 

Given the increased use of JIT engines, the existence of working set
detection and the fact it may also work for anonymous pages soon, I
think it's worth at least *trying* to remove the heuristic in case it
stays around for years as magic code.

Creating an automated test case for this would be relatively tricky. Could
you put together a debugging patch that simply counts some events to put
into the changelog? The events (which could be vmstat) would be

o VM_EXEC pages encountered in reclaim
o Number exec file-backed pages preserved
o Number exec anon pages preserved
o Number exec file-backed pages reclaimed
o NUmber exec anon pages reclaimed

And show the figures before and after in the changelog running Firefox
with excessive IO in the background. It's a bit of legwork but it's to
preserve in the changelog that this problem definitely happens and the
patch has a positive impact. Some comment saying the cursor is not laggy
with the patch applied would also be a plus. The debugging patch would
not actually be merged.

With the figures, if there ever is a bug report that bisects to this patch,
it'll be known exactly what the impact and motivation was. That will at
least make people pause and think before blindly reverting it.

I'm guessing the impact is that the ratio of reclaimed/preserved for anon
pages is currently skewed and after the patch it's more in line with the
ratio for file-backed. It's a tough prediction as the size of the file
vs anon LRUs at the time of reclaim matter as well as the ordering of
pages in the LRU.

-- 
Mel Gorman
SUSE Labs


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2020-03-06  9:22 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-04 20:32 [PATCH] mm/vmscan: Prioritize anonymous executable pages like we do file-backed Adam Jackson
2020-03-04 20:38 ` Matthew Wilcox
2020-03-05 12:47 ` Vlastimil Babka
2020-03-05 20:41   ` Shakeel Butt
2020-03-05 22:58     ` Adam Jackson
2020-03-06  9:22       ` Mel Gorman
2020-03-05 15:17 ` Michal Hocko
2020-03-05 18:05   ` Adam Jackson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).