All of lore.kernel.org
 help / color / mirror / Atom feed
From: Johannes Weiner <hannes@cmpxchg.org>
To: Mel Gorman <mgorman@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Linux Kernel <linux-kernel@vger.kernel.org>,
	Linux-MM <linux-mm@kvack.org>,
	Linux-FSDevel <linux-fsdevel@vger.kernel.org>
Subject: Re: [PATCH 4/5] mm: page_alloc: Reduce cost of the fair zone allocation policy
Date: Mon, 30 Jun 2014 10:41:42 -0400	[thread overview]
Message-ID: <20140630144142.GA1369@cmpxchg.org> (raw)
In-Reply-To: <20140627192537.GM10819@suse.de>

On Fri, Jun 27, 2014 at 08:25:37PM +0100, Mel Gorman wrote:
> On Fri, Jun 27, 2014 at 02:57:00PM -0400, Johannes Weiner wrote:
> > On Fri, Jun 27, 2014 at 09:14:39AM +0100, Mel Gorman wrote:
> > > And the number of pages allocated from each zone is comparable
> > > 
> > >                             3.16.0-rc2  3.16.0-rc2
> > >                               checklow    fairzone
> > > DMA allocs                           0           0
> > > DMA32 allocs                   7374217     7920241
> > > Normal allocs                999277551   996568115
> > 
> > Wow, the DMA32 zone gets less than 1% of the allocations.  What are
> > the zone sizes in this machine?
> > 
> 
>         managed  3976
>         managed  755409
>         managed  1281601

Something seems way off with this.  On my system here, the DMA32 zone
makes up for 20% of managed pages and it gets roughly 20% of the page
allocations, as I would expect.

Your DMA32 zone makes up for 37% of the managed pages and receives
merely 0.7% of the page allocations.  Unless a large portion of that
zone is somehow unreclaimable, fairness seems completely obliberated
in both kernels.

Is that checklow's doing?

> > > @@ -3287,10 +3287,18 @@ void show_free_areas(unsigned int filter)
> > >  	show_swap_cache_info();
> > >  }
> > >  
> > > -static void zoneref_set_zone(struct zone *zone, struct zoneref *zoneref)
> > > +static int zoneref_set_zone(pg_data_t *pgdat, struct zone *zone,
> > > +			struct zoneref *zoneref, struct zone *preferred_zone)
> > >  {
> > > +	int zone_type = zone_idx(zone);
> > > +	bool fair_enabled = zone_local(zone, preferred_zone);
> > > +	if (zone_type == 0 &&
> > > +			zone->managed_pages < (pgdat->node_present_pages >> 4))
> > > +		fair_enabled = false;
> > 
> > This needs a comment.
> > 
> 
>         /*
>          * Do not count the lowest zone as of relevance to the fair zone
>          * allocation policy if it's a small percentage of the node
>          */
> 
> However, as I write this I'll look at getting rid of this entirely. It
> made some sense when fair_eligible was tracked on a per-zone basis but
> it's more complex than necessary.
>
> > >  	zoneref->zone = zone;
> > > -	zoneref->zone_idx = zone_idx(zone);
> > > +	zoneref->zone_idx = zone_type;
> > > +	return fair_enabled;
> > >  }
> > >  
> > >  /*
> > > @@ -3303,17 +3311,26 @@ static int build_zonelists_node(pg_data_t *pgdat, struct zonelist *zonelist,
> > >  {
> > >  	struct zone *zone;
> > >  	enum zone_type zone_type = MAX_NR_ZONES;
> > > +	struct zone *preferred_zone = NULL;
> > > +	int nr_fair = 0;
> > >  
> > >  	do {
> > >  		zone_type--;
> > >  		zone = pgdat->node_zones + zone_type;
> > >  		if (populated_zone(zone)) {
> > > -			zoneref_set_zone(zone,
> > > -				&zonelist->_zonerefs[nr_zones++]);
> > > +			if (!preferred_zone)
> > > +				preferred_zone = zone;
> > > +
> > > +			nr_fair += zoneref_set_zone(pgdat, zone,
> > > +				&zonelist->_zonerefs[nr_zones++],
> > > +				preferred_zone);
> > 
> > Passing preferred_zone to determine locality seems pointless when you
> > walk the zones of a single node.
> > 
> 
> True.
> 
> > And the return value of zoneref_set_zone() is fairly unexpected.
> > 
> 
> How so?

Given the name zoneref_set_zone(), I wouldn't expect any return value,
or a success/failure type return value at best - certainly not whether
the passed zone is eligible for the fairness policy.

> > It's probably better to determine fair_enabled in the callsite, that
> > would fix both problems, and write a separate helper that tests if a
> > zone is eligible for fair treatment (type && managed_pages test).
> > 
> 
> Are you thinking of putting that into the page allocator fast path? I'm
> trying to take stuff out of there :/.

Not at all, I was just suggesting to restructure the code for building
the zonelists, and move the fairness stuff out of zoneref_set_zone().

If you remove the small-zone exclusion as per above, this only leaves
the locality check when building the zonelist in zone order and that
can easily be checked inline in build_zonelists_in_zone_order().

build_zonelists_node() can just count every populated zone in nr_fair.

WARNING: multiple messages have this Message-ID (diff)
From: Johannes Weiner <hannes@cmpxchg.org>
To: Mel Gorman <mgorman@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Linux Kernel <linux-kernel@vger.kernel.org>,
	Linux-MM <linux-mm@kvack.org>,
	Linux-FSDevel <linux-fsdevel@vger.kernel.org>
Subject: Re: [PATCH 4/5] mm: page_alloc: Reduce cost of the fair zone allocation policy
Date: Mon, 30 Jun 2014 10:41:42 -0400	[thread overview]
Message-ID: <20140630144142.GA1369@cmpxchg.org> (raw)
In-Reply-To: <20140627192537.GM10819@suse.de>

On Fri, Jun 27, 2014 at 08:25:37PM +0100, Mel Gorman wrote:
> On Fri, Jun 27, 2014 at 02:57:00PM -0400, Johannes Weiner wrote:
> > On Fri, Jun 27, 2014 at 09:14:39AM +0100, Mel Gorman wrote:
> > > And the number of pages allocated from each zone is comparable
> > > 
> > >                             3.16.0-rc2  3.16.0-rc2
> > >                               checklow    fairzone
> > > DMA allocs                           0           0
> > > DMA32 allocs                   7374217     7920241
> > > Normal allocs                999277551   996568115
> > 
> > Wow, the DMA32 zone gets less than 1% of the allocations.  What are
> > the zone sizes in this machine?
> > 
> 
>         managed  3976
>         managed  755409
>         managed  1281601

Something seems way off with this.  On my system here, the DMA32 zone
makes up for 20% of managed pages and it gets roughly 20% of the page
allocations, as I would expect.

Your DMA32 zone makes up for 37% of the managed pages and receives
merely 0.7% of the page allocations.  Unless a large portion of that
zone is somehow unreclaimable, fairness seems completely obliberated
in both kernels.

Is that checklow's doing?

> > > @@ -3287,10 +3287,18 @@ void show_free_areas(unsigned int filter)
> > >  	show_swap_cache_info();
> > >  }
> > >  
> > > -static void zoneref_set_zone(struct zone *zone, struct zoneref *zoneref)
> > > +static int zoneref_set_zone(pg_data_t *pgdat, struct zone *zone,
> > > +			struct zoneref *zoneref, struct zone *preferred_zone)
> > >  {
> > > +	int zone_type = zone_idx(zone);
> > > +	bool fair_enabled = zone_local(zone, preferred_zone);
> > > +	if (zone_type == 0 &&
> > > +			zone->managed_pages < (pgdat->node_present_pages >> 4))
> > > +		fair_enabled = false;
> > 
> > This needs a comment.
> > 
> 
>         /*
>          * Do not count the lowest zone as of relevance to the fair zone
>          * allocation policy if it's a small percentage of the node
>          */
> 
> However, as I write this I'll look at getting rid of this entirely. It
> made some sense when fair_eligible was tracked on a per-zone basis but
> it's more complex than necessary.
>
> > >  	zoneref->zone = zone;
> > > -	zoneref->zone_idx = zone_idx(zone);
> > > +	zoneref->zone_idx = zone_type;
> > > +	return fair_enabled;
> > >  }
> > >  
> > >  /*
> > > @@ -3303,17 +3311,26 @@ static int build_zonelists_node(pg_data_t *pgdat, struct zonelist *zonelist,
> > >  {
> > >  	struct zone *zone;
> > >  	enum zone_type zone_type = MAX_NR_ZONES;
> > > +	struct zone *preferred_zone = NULL;
> > > +	int nr_fair = 0;
> > >  
> > >  	do {
> > >  		zone_type--;
> > >  		zone = pgdat->node_zones + zone_type;
> > >  		if (populated_zone(zone)) {
> > > -			zoneref_set_zone(zone,
> > > -				&zonelist->_zonerefs[nr_zones++]);
> > > +			if (!preferred_zone)
> > > +				preferred_zone = zone;
> > > +
> > > +			nr_fair += zoneref_set_zone(pgdat, zone,
> > > +				&zonelist->_zonerefs[nr_zones++],
> > > +				preferred_zone);
> > 
> > Passing preferred_zone to determine locality seems pointless when you
> > walk the zones of a single node.
> > 
> 
> True.
> 
> > And the return value of zoneref_set_zone() is fairly unexpected.
> > 
> 
> How so?

Given the name zoneref_set_zone(), I wouldn't expect any return value,
or a success/failure type return value at best - certainly not whether
the passed zone is eligible for the fairness policy.

> > It's probably better to determine fair_enabled in the callsite, that
> > would fix both problems, and write a separate helper that tests if a
> > zone is eligible for fair treatment (type && managed_pages test).
> > 
> 
> Are you thinking of putting that into the page allocator fast path? I'm
> trying to take stuff out of there :/.

Not at all, I was just suggesting to restructure the code for building
the zonelists, and move the fairness stuff out of zoneref_set_zone().

If you remove the small-zone exclusion as per above, this only leaves
the locality check when building the zonelist in zone order and that
can easily be checked inline in build_zonelists_in_zone_order().

build_zonelists_node() can just count every populated zone in nr_fair.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2014-06-30 14:41 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-06-27  8:14 [PATCH 0/5] Improve sequential read throughput v3 Mel Gorman
2014-06-27  8:14 ` Mel Gorman
2014-06-27  8:14 ` [PATCH 1/5] mm: pagemap: Avoid unnecessary overhead when tracepoints are deactivated Mel Gorman
2014-06-27  8:14   ` Mel Gorman
2014-06-27  8:14 ` [PATCH 2/5] mm: Rearrange zone fields into read-only, page alloc, statistics and page reclaim lines Mel Gorman
2014-06-27  8:14   ` Mel Gorman
2014-06-27  8:14 ` [PATCH 3/5] mm: vmscan: Do not reclaim from lower zones if they are balanced Mel Gorman
2014-06-27  8:14   ` Mel Gorman
2014-06-27 17:26   ` Johannes Weiner
2014-06-27 17:26     ` Johannes Weiner
2014-06-27 18:42     ` Mel Gorman
2014-06-27 18:42       ` Mel Gorman
2014-06-27  8:14 ` [PATCH 4/5] mm: page_alloc: Reduce cost of the fair zone allocation policy Mel Gorman
2014-06-27  8:14   ` Mel Gorman
2014-06-27 18:57   ` Johannes Weiner
2014-06-27 18:57     ` Johannes Weiner
2014-06-27 19:25     ` Mel Gorman
2014-06-27 19:25       ` Mel Gorman
2014-06-30 14:41       ` Johannes Weiner [this message]
2014-06-30 14:41         ` Johannes Weiner
2014-06-27  8:14 ` [PATCH 5/5] mm: page_alloc: Reduce cost of dirty zone balancing Mel Gorman
2014-06-27  8:14   ` Mel Gorman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140630144142.GA1369@cmpxchg.org \
    --to=hannes@cmpxchg.org \
    --cc=akpm@linux-foundation.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.