From: Johannes Weiner <hannes@cmpxchg.org>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@suse.de>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linux Kernel <linux-kernel@vger.kernel.org>,
	Linux-MM <linux-mm@kvack.org>,
	Linux-FSDevel <linux-fsdevel@vger.kernel.org>
Subject: Re: [PATCH 6/6] mm: page_alloc: Reduce cost of the fair zone allocation policy
Date: Tue, 2 Sep 2014 10:01:16 -0400	[thread overview]
Message-ID: <20140902140116.GD29501@cmpxchg.org> (raw)
In-Reply-To: <53E8B83D.1070004@suse.cz>

On Mon, Aug 11, 2014 at 02:34:05PM +0200, Vlastimil Babka wrote:
> On 08/11/2014 02:12 PM, Mel Gorman wrote:
> >On Fri, Aug 08, 2014 at 05:27:15PM +0200, Vlastimil Babka wrote:
> >>On 07/09/2014 10:13 AM, Mel Gorman wrote:
> >>>--- a/mm/page_alloc.c
> >>>+++ b/mm/page_alloc.c
> >>>@@ -1604,6 +1604,9 @@ again:
> >>>  	}
> >>>
> >>>  	__mod_zone_page_state(zone, NR_ALLOC_BATCH, -(1 << order));
> >>
> >>This can underflow past zero, right?
> >>
> >
> >Yes, because of per-cpu accounting drift.
> 
> I meant mainly because of order > 0.
> 
> >>>+	if (zone_page_state(zone, NR_ALLOC_BATCH) == 0 &&
> >>
> >>AFAICS, zone_page_state will correct negative values to zero only for
> >>CONFIG_SMP. Won't this check be broken on !CONFIG_SMP?
> >>
> >
> >On !CONFIG_SMP how can there be per-cpu accounting drift that would make
> >that counter negative?
> 
> Well, the original code used "if (zone_page_state(zone, NR_ALLOC_BATCH) <= 0)"
> elsewhere, which you are replacing with the zone_is_fair_depleted check. I
> assumed that's because it can go negative due to order > 0. I might not have
> looked thoroughly enough, but it seems to me there's nothing that would
> prevent it, such as skipping a zone whose remaining batch is lower
> than 1 << order.
> So I think the check should be "<= 0" to be safe.

Any updates on this?

The counter can definitely underflow on !CONFIG_SMP, and then the flag
gets out of sync with the actual batch state.  I'd still prefer just
removing this flag again; it's extra complexity and error-prone (case
in point), while the upsides are not even measurable in real life.
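
To make the UP case concrete: zone_page_state() only rounds negative
per-zone counters up to zero under CONFIG_SMP, where per-cpu drift
makes transient negatives expected; on UP the raw value is returned.
Here is a minimal userspace model of the scenario Vlastimil describes
(an illustrative sketch, not kernel code; the helper names and the
standalone main() are made up for the example):

#include <stdbool.h>
#include <stdio.h>

static long nr_alloc_batch = 3;	/* remaining fair-policy batch, in pages */
static bool fair_depleted;	/* models the ZONE_FAIR_DEPLETED bit */

/* Models zone_page_state(): clamps negative values only on "SMP". */
static long page_state(bool smp)
{
	long x = nr_alloc_batch;

	if (smp && x < 0)
		x = 0;
	return x;
}

static void allocate(unsigned int order, bool smp)
{
	nr_alloc_batch -= 1L << order;	/* as __mod_zone_page_state() */

	/* The check under discussion: it only fires on an exact zero. */
	if (page_state(smp) == 0 && !fair_depleted)
		fair_depleted = true;
}

int main(void)
{
	/* An order-2 allocation takes 4 pages from a batch of 3 ... */
	allocate(2, false);	/* false: model !CONFIG_SMP */

	/* ... leaving -1: the batch is gone, yet the flag is unset. */
	printf("batch=%ld depleted=%d\n", nr_alloc_batch, fair_depleted);
	return 0;
}

With smp == true the clamped read returns 0, so the flag is set as
intended; with smp == false the read returns -1, "== 0" never matches,
and the flag stays stale until the next batch reset.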

---

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 318df7051850..0bd77f730b38 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -534,7 +534,6 @@ typedef enum {
 	ZONE_WRITEBACK,			/* reclaim scanning has recently found
 					 * many pages under writeback
 					 */
-	ZONE_FAIR_DEPLETED,		/* fair zone policy batch depleted */
 } zone_flags_t;
 
 static inline void zone_set_flag(struct zone *zone, zone_flags_t flag)
@@ -572,11 +571,6 @@ static inline int zone_is_reclaim_locked(const struct zone *zone)
 	return test_bit(ZONE_RECLAIM_LOCKED, &zone->flags);
 }
 
-static inline int zone_is_fair_depleted(const struct zone *zone)
-{
-	return test_bit(ZONE_FAIR_DEPLETED, &zone->flags);
-}
-
 static inline int zone_is_oom_locked(const struct zone *zone)
 {
 	return test_bit(ZONE_OOM_LOCKED, &zone->flags);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 18cee0d4c8a2..d913809a328f 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1612,9 +1612,6 @@ again:
 	}
 
 	__mod_zone_page_state(zone, NR_ALLOC_BATCH, -(1 << order));
-	if (zone_page_state(zone, NR_ALLOC_BATCH) == 0 &&
-	    !zone_is_fair_depleted(zone))
-		zone_set_flag(zone, ZONE_FAIR_DEPLETED);
 
 	__count_zone_vm_events(PGALLOC, zone, 1 << order);
 	zone_statistics(preferred_zone, zone, gfp_flags);
@@ -1934,7 +1931,6 @@ static void reset_alloc_batches(struct zone *preferred_zone)
 		mod_zone_page_state(zone, NR_ALLOC_BATCH,
 			high_wmark_pages(zone) - low_wmark_pages(zone) -
 			atomic_long_read(&zone->vm_stat[NR_ALLOC_BATCH]));
-		zone_clear_flag(zone, ZONE_FAIR_DEPLETED);
 	} while (zone++ != preferred_zone);
 }
 
@@ -1985,7 +1981,7 @@ zonelist_scan:
 		if (alloc_flags & ALLOC_FAIR) {
 			if (!zone_local(preferred_zone, zone))
 				break;
-			if (zone_is_fair_depleted(zone)) {
+			if (zone_page_state(zone, NR_ALLOC_BATCH) <= 0) {
 				nr_fair_skipped++;
 				continue;
 			}
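
Design note on the replacement check, using the same toy model as
above (again a sketch with made-up names, not the kernel code): with
the flag gone, the skip decision is recomputed from the counter on
every read, so it cannot go stale, and "<= 0" means an order > 0
underflow still counts as depleted.

/* Stand-in for the removed zone_is_fair_depleted() test. */
static bool batch_depleted(void)
{
	return nr_alloc_batch <= 0;
}

/*
 * Models reset_alloc_batches(): refilling the counter alone makes the
 * zone eligible again, since there is no flag left to clear.  (The
 * kernel applies this as a relative delta via mod_zone_page_state().)
 */
static void reset_batch(long high_wmark, long low_wmark)
{
	nr_alloc_batch += (high_wmark - low_wmark) - nr_alloc_batch;
}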

Thread overview: 49+ messages
2014-07-09  8:13 [PATCH 0/5] Reduce sequential read overhead Mel Gorman
2014-07-09  8:13 ` [PATCH 1/6] mm: pagemap: Avoid unnecessary overhead when tracepoints are deactivated Mel Gorman
2014-07-10 12:01   ` Johannes Weiner
2014-07-09  8:13 ` [PATCH 2/6] mm: Rearrange zone fields into read-only, page alloc, statistics and page reclaim lines Mel Gorman
2014-07-10 12:06   ` Johannes Weiner
2014-07-09  8:13 ` [PATCH 3/6] mm: Move zone->pages_scanned into a vmstat counter Mel Gorman
2014-07-10 12:08   ` Johannes Weiner
2014-07-09  8:13 ` [PATCH 4/6] mm: vmscan: Only update per-cpu thresholds for online CPU Mel Gorman
2014-07-10 12:09   ` Johannes Weiner
2014-07-09  8:13 ` [PATCH 5/6] mm: page_alloc: Abort fair zone allocation policy when remotes nodes are encountered Mel Gorman
2014-07-10 12:14   ` Johannes Weiner
2014-07-10 12:44     ` Mel Gorman
2014-07-09  8:13 ` [PATCH 6/6] mm: page_alloc: Reduce cost of the fair zone allocation policy Mel Gorman
2014-07-10 12:18   ` Johannes Weiner
2014-08-08 15:27   ` Vlastimil Babka
2014-08-11 12:12     ` Mel Gorman
2014-08-11 12:34       ` Vlastimil Babka
2014-09-02 14:01         ` Johannes Weiner [this message]
2014-09-05 10:14           ` [PATCH] mm: page_alloc: Fix setting of ZONE_FAIR_DEPLETED on UP Mel Gorman
2014-09-07  6:32             ` Leon Romanovsky
2014-09-08 11:57               ` [PATCH] mm: page_alloc: Fix setting of ZONE_FAIR_DEPLETED on UP v2 Mel Gorman
2014-09-09  8:17                 ` Leon Romanovsky
2014-09-09 19:53                 ` Andrew Morton
2014-09-10  9:16                   ` Mel Gorman
2014-09-10 20:32                     ` Johannes Weiner
