* [PATCH V2 5/5] memcg: change the target nr_to_reclaim for each memcg under kswapd
@ 2012-04-11 22:00 Ying Han
  2012-04-11 23:56 ` Johannes Weiner
  2012-04-15  1:57 ` Hillf Danton
  0 siblings, 2 replies; 8+ messages in thread
From: Ying Han @ 2012-04-11 22:00 UTC
  To: Michal Hocko, Johannes Weiner, Mel Gorman, KAMEZAWA Hiroyuki,
	Rik van Riel, Hillf Danton, Hugh Dickins, Dan Magenheimer
  Cc: linux-mm

Under global background reclaim, sc->nr_to_reclaim is set to
ULONG_MAX. Now that we iterate over all memcgs under the zone, we
shouldn't pass the full kswapd pressure on to each individual memcg.

After all, balance_pgdat() breaks out after reclaiming SWAP_CLUSTER_MAX
pages to prevent building up reclaim priorities.

Signed-off-by: Ying Han <yinghan@google.com>
---
 mm/vmscan.c |   12 ++++++++++--
 1 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index d65eae4..ca70ec6 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2083,9 +2083,18 @@ static void shrink_mem_cgroup_zone(int priority, struct mem_cgroup_zone *mz,
 	unsigned long nr_to_scan;
 	enum lru_list lru;
 	unsigned long nr_reclaimed, nr_scanned;
-	unsigned long nr_to_reclaim = sc->nr_to_reclaim;
+	unsigned long nr_to_reclaim;
 	struct blk_plug plug;
 
+	/*
+	 * Under global background reclaim, sc->nr_to_reclaim is set to
+	 * ULONG_MAX. Now that we iterate over all memcgs under the zone,
+	 * we shouldn't pass the full kswapd pressure on to each memcg.
+	 * After all, balance_pgdat() breaks out after reclaiming
+	 * SWAP_CLUSTER_MAX pages to prevent building up reclaim priorities.
+	 */
+	nr_to_reclaim = min_t(unsigned long,
+			      sc->nr_to_reclaim, SWAP_CLUSTER_MAX);
 restart:
 	nr_reclaimed = 0;
 	nr_scanned = sc->nr_scanned;
@@ -2755,7 +2764,6 @@ loop_again:
 					high_wmark_pages(zone) + balance_gap,
 					end_zone, 0)) {
 				shrink_zone(priority, zone, &sc);
-
 				reclaim_state->reclaimed_slab = 0;
 				nr_slab = shrink_slab(&shrink, sc.nr_scanned, lru_pages);
 				sc.nr_reclaimed += reclaim_state->reclaimed_slab;
-- 
1.7.7.3


* Re: [PATCH V2 5/5] memcg: change the target nr_to_reclaim for each memcg under kswapd
  2012-04-11 22:00 [PATCH V2 5/5] memcg: change the target nr_to_reclaim for each memcg under kswapd Ying Han
@ 2012-04-11 23:56 ` Johannes Weiner
  2012-04-12  4:06   ` Ying Han
  2012-04-15  1:57 ` Hillf Danton
  1 sibling, 1 reply; 8+ messages in thread
From: Johannes Weiner @ 2012-04-11 23:56 UTC
  To: Ying Han
  Cc: Michal Hocko, Mel Gorman, KAMEZAWA Hiroyuki, Rik van Riel,
	Hillf Danton, Hugh Dickins, Dan Magenheimer, linux-mm

On Wed, Apr 11, 2012 at 03:00:27PM -0700, Ying Han wrote:
> Under global background reclaim, sc->nr_to_reclaim is set to
> ULONG_MAX. Now that we iterate over all memcgs under the zone, we
> shouldn't pass the full kswapd pressure on to each individual memcg.
>
> After all, balance_pgdat() breaks out after reclaiming SWAP_CLUSTER_MAX
> pages to prevent building up reclaim priorities.

shrink_mem_cgroup_zone() bails out of a zone, balance_pgdat() bails
out of a priority loop, there is quite a difference.

After this patch, kswapd no longer puts equal pressure on all zones in
the zonelist, which was a key reason why we could justify bailing
early out of individual zones in direct reclaim: kswapd will restore
fairness.


* Re: [PATCH V2 5/5] memcg: change the target nr_to_reclaim for each memcg under kswapd
  2012-04-11 23:56 ` Johannes Weiner
@ 2012-04-12  4:06   ` Ying Han
  2012-04-12 14:24     ` Johannes Weiner
  0 siblings, 1 reply; 8+ messages in thread
From: Ying Han @ 2012-04-12  4:06 UTC
  To: Johannes Weiner
  Cc: Michal Hocko, Mel Gorman, KAMEZAWA Hiroyuki, Rik van Riel,
	Hillf Danton, Hugh Dickins, Dan Magenheimer, linux-mm

On Wed, Apr 11, 2012 at 4:56 PM, Johannes Weiner <hannes@cmpxchg.org> wrote:
> On Wed, Apr 11, 2012 at 03:00:27PM -0700, Ying Han wrote:
>> Under global background reclaim, sc->nr_to_reclaim is set to
>> ULONG_MAX. Now that we iterate over all memcgs under the zone, we
>> shouldn't pass the full kswapd pressure on to each individual memcg.
>>
>> After all, balance_pgdat() breaks out after reclaiming SWAP_CLUSTER_MAX
>> pages to prevent building up reclaim priorities.
>
> shrink_mem_cgroup_zone() bails out of a zone, balance_pgdat() bails
> out of a priority loop, there is quite a difference.
>
> After this patch, kswapd no longer puts equal pressure on all zones in
> the zonelist, which was a key reason why we could justify bailing
> early out of individual zones in direct reclaim: kswapd will restore
> fairness.

Guess I see your point here.

My intention is to prevent over-reclaiming memcgs per-zone when
nr_to_reclaim is set to ULONG_MAX. Now we scan each memcg based on
get_scan_count() without bailing out; do you see a problem w/o this patch?

--Ying


* Re: [PATCH V2 5/5] memcg: change the target nr_to_reclaim for each memcg under kswapd
  2012-04-12  4:06   ` Ying Han
@ 2012-04-12 14:24     ` Johannes Weiner
  2012-04-12 16:45       ` Ying Han
  0 siblings, 1 reply; 8+ messages in thread
From: Johannes Weiner @ 2012-04-12 14:24 UTC
  To: Ying Han
  Cc: Michal Hocko, Mel Gorman, KAMEZAWA Hiroyuki, Rik van Riel,
	Hillf Danton, Hugh Dickins, Dan Magenheimer, linux-mm

On Wed, Apr 11, 2012 at 09:06:27PM -0700, Ying Han wrote:
> On Wed, Apr 11, 2012 at 4:56 PM, Johannes Weiner <hannes@cmpxchg.org> wrote:
> > On Wed, Apr 11, 2012 at 03:00:27PM -0700, Ying Han wrote:
> >> Under global background reclaim, sc->nr_to_reclaim is set to
> >> ULONG_MAX. Now that we iterate over all memcgs under the zone, we
> >> shouldn't pass the full kswapd pressure on to each individual memcg.
> >>
> >> After all, balance_pgdat() breaks out after reclaiming SWAP_CLUSTER_MAX
> >> pages to prevent building up reclaim priorities.
> >
> > shrink_mem_cgroup_zone() bails out of a zone, balance_pgdat() bails
> > out of a priority loop, there is quite a difference.
> >
> > After this patch, kswapd no longer puts equal pressure on all zones in
> > the zonelist, which was a key reason why we could justify bailing
> > early out of individual zones in direct reclaim: kswapd will restore
> > fairness.
> 
> Guess I see your point here.
> 
> My intention is to prevent over-reclaiming memcgs per-zone when
> nr_to_reclaim is set to ULONG_MAX. Now we scan each memcg based on
> get_scan_count() without bailing out; do you see a problem w/o this patch?

The fact that we iterate over each memcg does not make a difference,
because the targets that get_scan_count() returns for each zone-memcg
sum to what it would have returned for the whole zone, so the scan
aggressiveness does not increase.  It just distributes the zone's scan
target over the set of memcgs in proportion to their share of pages in
that zone.
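
As a minimal sketch of that proportional split (the helper name and
signature are invented for illustration; this is not the actual
get_scan_count()):

	/*
	 * Each memcg receives a slice of the zone-wide scan target in
	 * proportion to its share of the zone's pages, so the per-memcg
	 * slices sum back to the zone-wide target.  Overflow in the
	 * multiplication is ignored for brevity.
	 */
	static unsigned long memcg_scan_slice(unsigned long zone_target,
					      unsigned long memcg_pages,
					      unsigned long zone_pages)
	{
		if (!zone_pages)
			return 0;
		return zone_target * memcg_pages / zone_pages;
	}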

So I have trouble deciding what's right.

On the one hand, I don't see why you bother with this patch, because
you don't increase the risk of overreclaim.  Michal's concern for
overreclaim came from the fact that I had kswapd do soft limit reclaim
at priority 0 without ever bailing from individual zones.  But your
soft limit implementation is purely about selecting memcgs to reclaim,
you never increase the pressure put on a memcg anywhere.

On the other hand, I don't even agree with that aspect of your series:
that you no longer prioritize explicitly soft-limited groups in
excess over unconfigured groups, as I mentioned in the other mail.
But if you did, you would likely need a patch like this, I think.


* Re: [PATCH V2 5/5] memcg: change the target nr_to_reclaim for each memcg under kswapd
  2012-04-12 14:24     ` Johannes Weiner
@ 2012-04-12 16:45       ` Ying Han
  2012-04-12 17:44         ` Johannes Weiner
  0 siblings, 1 reply; 8+ messages in thread
From: Ying Han @ 2012-04-12 16:45 UTC
  To: Johannes Weiner
  Cc: Michal Hocko, Mel Gorman, KAMEZAWA Hiroyuki, Rik van Riel,
	Hillf Danton, Hugh Dickins, Dan Magenheimer, linux-mm

On Thu, Apr 12, 2012 at 7:24 AM, Johannes Weiner <hannes@cmpxchg.org> wrote:
> On Wed, Apr 11, 2012 at 09:06:27PM -0700, Ying Han wrote:
>> On Wed, Apr 11, 2012 at 4:56 PM, Johannes Weiner <hannes@cmpxchg.org> wrote:
>> > On Wed, Apr 11, 2012 at 03:00:27PM -0700, Ying Han wrote:
>> >> Under global background reclaim, sc->nr_to_reclaim is set to
>> >> ULONG_MAX. Now that we iterate over all memcgs under the zone, we
>> >> shouldn't pass the full kswapd pressure on to each individual memcg.
>> >>
>> >> After all, balance_pgdat() breaks out after reclaiming SWAP_CLUSTER_MAX
>> >> pages to prevent building up reclaim priorities.
>> >
>> > shrink_mem_cgroup_zone() bails out of a zone, balance_pgdat() bails
>> > out of a priority loop, there is quite a difference.
>> >
>> > After this patch, kswapd no longer puts equal pressure on all zones in
>> > the zonelist, which was a key reason why we could justify bailing
>> > early out of individual zones in direct reclaim: kswapd will restore
>> > fairness.
>>
>> Guess I see your point here.
>>
>> My intention is to prevent over-reclaiming memcgs per-zone when
>> nr_to_reclaim is set to ULONG_MAX. Now we scan each memcg based on
>> get_scan_count() without bailing out; do you see a problem w/o this patch?
>
> The fact that we iterate over each memcg does not make a difference,
> because the targets that get_scan_count() returns for each zone-memcg
> sum to what it would have returned for the whole zone, so the scan
> aggressiveness does not increase.  It just distributes the zone's scan
> target over the set of memcgs in proportion to their share of pages in
> that zone.
>
> So I have trouble deciding what's right.
>
> On the one hand, I don't see why you bother with this patch, because
> you don't increase the risk of overreclaim.  Michal's concern for
> overreclaim came from the fact that I had kswapd do soft limit reclaim
> at priority 0 without ever bailing from individual zones.  But your
> soft limit implementation is purely about selecting memcgs to reclaim,
> you never increase the pressure put on a memcg anywhere.

I agree w/ you here.

>
> On the other hand, I don't even agree with that aspect of your series:
> that you no longer prioritize explicitly soft-limited groups in
> excess over unconfigured groups, as I mentioned in the other mail.
> But if you did, you would likely need a patch like this, I think.

Prioritize between a memcg w/ the default softlimit (0) and a memcg
exceeding a non-default softlimit (x)?

Are you referring to balancing the reclaim between eligible memcgs
based on different factors like softlimit_exceed, recent_scanned,
recent_reclaimed...? If so, I am planning to make that a second step
after this patch series.

--Ying


* Re: [PATCH V2 5/5] memcg: change the target nr_to_reclaim for each memcg under kswapd
  2012-04-12 16:45       ` Ying Han
@ 2012-04-12 17:44         ` Johannes Weiner
  2012-04-12 17:58           ` Ying Han
  0 siblings, 1 reply; 8+ messages in thread
From: Johannes Weiner @ 2012-04-12 17:44 UTC
  To: Ying Han
  Cc: Michal Hocko, Mel Gorman, KAMEZAWA Hiroyuki, Rik van Riel,
	Hillf Danton, Hugh Dickins, Dan Magenheimer, linux-mm

On Thu, Apr 12, 2012 at 09:45:47AM -0700, Ying Han wrote:
> On Thu, Apr 12, 2012 at 7:24 AM, Johannes Weiner <hannes@cmpxchg.org> wrote:
> > On Wed, Apr 11, 2012 at 09:06:27PM -0700, Ying Han wrote:
> >> On Wed, Apr 11, 2012 at 4:56 PM, Johannes Weiner <hannes@cmpxchg.org> wrote:
> >> > On Wed, Apr 11, 2012 at 03:00:27PM -0700, Ying Han wrote:
> >> >> Under global background reclaim, sc->nr_to_reclaim is set to
> >> >> ULONG_MAX. Now that we iterate over all memcgs under the zone, we
> >> >> shouldn't pass the full kswapd pressure on to each individual memcg.
> >> >>
> >> >> After all, balance_pgdat() breaks out after reclaiming SWAP_CLUSTER_MAX
> >> >> pages to prevent building up reclaim priorities.
> >> >
> >> > shrink_mem_cgroup_zone() bails out of a zone, balance_pgdat() bails
> >> > out of a priority loop, there is quite a difference.
> >> >
> >> > After this patch, kswapd no longer puts equal pressure on all zones in
> >> > the zonelist, which was a key reason why we could justify bailing
> >> > early out of individual zones in direct reclaim: kswapd will restore
> >> > fairness.
> >>
> >> Guess I see your point here.
> >>
> >> My intention is to prevent over-reclaiming memcgs per-zone when
> >> nr_to_reclaim is set to ULONG_MAX. Now we scan each memcg based on
> >> get_scan_count() without bailing out; do you see a problem w/o this patch?
> >
> > The fact that we iterate over each memcg does not make a difference,
> > because the targets that get_scan_count() returns for each zone-memcg
> > sum to what it would have returned for the whole zone, so the scan
> > aggressiveness does not increase.  It just distributes the zone's scan
> > target over the set of memcgs in proportion to their share of pages in
> > that zone.
> >
> > So I have trouble deciding what's right.
> >
> > On the one hand, I don't see why you bother with this patch, because
> > you don't increase the risk of overreclaim.  Michal's concern for
> > overreclaim came from the fact that I had kswapd do soft limit reclaim
> > at priority 0 without ever bailing from individual zones.  But your
> > soft limit implementation is purely about selecting memcgs to reclaim,
> > you never increase the pressure put on a memcg anywhere.
> 
> I agree w/ you here.
> 
> >
> > On the other hand, I don't even agree with that aspect of your series:
> > that you no longer prioritize explicitly soft-limited groups in
> > excess over unconfigured groups, as I mentioned in the other mail.
> > But if you did, you would likely need a patch like this, I think.
> 
> Prioritize between a memcg w/ the default softlimit (0) and a memcg
> exceeding a non-default softlimit (x)?

Yup:

	A ( soft = default, usage = 10 )
	B ( soft =       8, usage = 10 )

This is the "memory-nice this one workload" I was referring to in the
other mail.  It would have reclaimed B more aggressively than A in the
past.  After your patch, they will both be reclaimed equally, because
you change the default from "below softlimit" to "above soft limit".
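
To make the flip concrete, a hypothetical sketch (the default values
assumed here, ULONG_MAX before and 0 after, follow the reading
discussed below and are not quoted from the series itself):

	/*
	 * Old default: an unconfigured group never counts as "in
	 * excess", so only B (soft = 8, usage = 10) is preferentially
	 * reclaimed.  New default: an unconfigured group always counts
	 * as "in excess", so A and B above get equal pressure.
	 */
	#define OLD_DEFAULT_SOFT_LIMIT	ULONG_MAX	/* never exceeded */
	#define NEW_DEFAULT_SOFT_LIMIT	0UL		/* always exceeded */

	static bool soft_limit_exceeded(unsigned long usage, unsigned long soft)
	{
		return usage > soft;
	}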

> Are you referring to balancing the reclaim between eligible memcgs
> based on different factors like softlimit_exceed, recent_scanned,
> recent_reclaimed...? If so, I am planning to make that a second step
> after this patch series.

Well, humm.  You potentially break existing setups.  It would be good
not to do that, even just temporarily.


* Re: [PATCH V2 5/5] memcg: change the target nr_to_reclaim for each memcg under kswapd
  2012-04-12 17:44         ` Johannes Weiner
@ 2012-04-12 17:58           ` Ying Han
  0 siblings, 0 replies; 8+ messages in thread
From: Ying Han @ 2012-04-12 17:58 UTC
  To: Johannes Weiner
  Cc: Michal Hocko, Mel Gorman, KAMEZAWA Hiroyuki, Rik van Riel,
	Hillf Danton, Hugh Dickins, Dan Magenheimer, linux-mm

On Thu, Apr 12, 2012 at 10:44 AM, Johannes Weiner <hannes@cmpxchg.org> wrote:
> On Thu, Apr 12, 2012 at 09:45:47AM -0700, Ying Han wrote:
>> On Thu, Apr 12, 2012 at 7:24 AM, Johannes Weiner <hannes@cmpxchg.org> wrote:
>> > On Wed, Apr 11, 2012 at 09:06:27PM -0700, Ying Han wrote:
>> >> On Wed, Apr 11, 2012 at 4:56 PM, Johannes Weiner <hannes@cmpxchg.org> wrote:
>> >> > On Wed, Apr 11, 2012 at 03:00:27PM -0700, Ying Han wrote:
>> >> >> Under global background reclaim, sc->nr_to_reclaim is set to
>> >> >> ULONG_MAX. Now that we iterate over all memcgs under the zone, we
>> >> >> shouldn't pass the full kswapd pressure on to each individual memcg.
>> >> >>
>> >> >> After all, balance_pgdat() breaks out after reclaiming SWAP_CLUSTER_MAX
>> >> >> pages to prevent building up reclaim priorities.
>> >> >
>> >> > shrink_mem_cgroup_zone() bails out of a zone, balance_pgdat() bails
>> >> > out of a priority loop, there is quite a difference.
>> >> >
>> >> > After this patch, kswapd no longer puts equal pressure on all zones in
>> >> > the zonelist, which was a key reason why we could justify bailing
>> >> > early out of individual zones in direct reclaim: kswapd will restore
>> >> > fairness.
>> >>
>> >> Guess I see your point here.
>> >>
>> >> My intention is to prevent over-reclaiming memcgs per-zone when
>> >> nr_to_reclaim is set to ULONG_MAX. Now we scan each memcg based on
>> >> get_scan_count() without bailing out; do you see a problem w/o this patch?
>> >
>> > The fact that we iterate over each memcg does not make a difference,
>> > because the targets that get_scan_count() returns for each zone-memcg
>> > sum to what it would have returned for the whole zone, so the scan
>> > aggressiveness does not increase.  It just distributes the zone's scan
>> > target over the set of memcgs in proportion to their share of pages in
>> > that zone.
>> >
>> > So I have trouble deciding what's right.
>> >
>> > On the one hand, I don't see why you bother with this patch, because
>> > you don't increase the risk of overreclaim.  Michal's concern for
>> > overreclaim came from the fact that I had kswapd do soft limit reclaim
>> > at priority 0 without ever bailing from individual zones.  But your
>> > soft limit implementation is purely about selecting memcgs to reclaim,
>> > you never increase the pressure put on a memcg anywhere.
>>
>> I agree w/ you here.
>>
>> >
>> > On the other hand, I don't even agree with that aspect of your series:
>> > that you no longer prioritize explicitly soft-limited groups in
>> > excess over unconfigured groups, as I mentioned in the other mail.
>> > But if you did, you would likely need a patch like this, I think.
>>
>> Prioritize between a memcg w/ the default softlimit (0) and a memcg
>> exceeding a non-default softlimit (x)?
>
> Yup:
>
>        A ( soft = default, usage = 10 )
>        B ( soft =       8, usage = 10 )
>
> This is the "memory-nice this one workload" I was referring to in the
> other mail.  It would have reclaimed B more aggressively than A in the
> past.

What's the default value here? If the default is ULONG_MAX, then it
won't reclaim from A at all. I assume you meant the new default value
"0"; in that case, today it will only pick A to reclaim due to the
sorted rb-tree.

> After your patch, they will both be reclaimed equally, because
> you change the default from "below softlimit" to "above soft limit".

Yes, the patch flipped the "default", which makes A reclaimable.

>
>> Are you referring to balancing the reclaim between eligible memcgs
>> based on different factors like softlimit_exceed, recent_scanned,
>> recent_reclaimed...? If so, I am planning to make that a second step
>> after this patch series.
>
> Well, humm.  You potentially break existing setups.  It would be good
> not to do that, even just temporarily.

I agree this is different from the past, but I am not sure how hard we
need to try to preserve the old behavior, given:
1. the old implementation is sub-optimal and needs to be changed
2. very few customers use the soft_limit at this moment, and we have a
better chance to change it now than later.

That said, I can try to make the current patch work like "pick the
memcg that exceeds its softlimit the most", just like it does now. But
my focus is more on cleaning up the existing code so that further
optimizations can be applied easily.

--Ying


* Re: [PATCH V2 5/5] memcg: change the target nr_to_reclaim for each memcg under kswapd
  2012-04-11 22:00 [PATCH V2 5/5] memcg: change the target nr_to_reclaim for each memcg under kswapd Ying Han
  2012-04-11 23:56 ` Johannes Weiner
@ 2012-04-15  1:57 ` Hillf Danton
  1 sibling, 0 replies; 8+ messages in thread
From: Hillf Danton @ 2012-04-15  1:57 UTC
  To: Ying Han
  Cc: Michal Hocko, Johannes Weiner, Mel Gorman, KAMEZAWA Hiroyuki,
	Rik van Riel, Hugh Dickins, Dan Magenheimer, linux-mm

On Thu, Apr 12, 2012 at 6:00 AM, Ying Han <yinghan@google.com> wrote:
> Under global background reclaim, sc->nr_to_reclaim is set to
> ULONG_MAX. Now that we iterate over all memcgs under the zone, we
> shouldn't pass the full kswapd pressure on to each individual memcg.
>
> After all, balance_pgdat() breaks out after reclaiming SWAP_CLUSTER_MAX
> pages to prevent building up reclaim priorities.
>
> Signed-off-by: Ying Han <yinghan@google.com>
> ---
>  mm/vmscan.c |   12 ++++++++++--
>  1 files changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index d65eae4..ca70ec6 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -2083,9 +2083,18 @@ static void shrink_mem_cgroup_zone(int priority, struct mem_cgroup_zone *mz,
>        unsigned long nr_to_scan;
>        enum lru_list lru;
>        unsigned long nr_reclaimed, nr_scanned;
> -       unsigned long nr_to_reclaim = sc->nr_to_reclaim;
> +       unsigned long nr_to_reclaim;
>        struct blk_plug plug;
>
> +       /*
> +        * Under global background reclaim, sc->nr_to_reclaim is set to
> +        * ULONG_MAX. Now that we iterate over all memcgs under the zone,
> +        * we shouldn't pass the full kswapd pressure on to each memcg.
> +        * After all, balance_pgdat() breaks out after reclaiming
> +        * SWAP_CLUSTER_MAX pages to prevent building up reclaim priorities.
> +        */
> +       nr_to_reclaim = min_t(unsigned long,
> +                             sc->nr_to_reclaim, SWAP_CLUSTER_MAX);
>  restart:
>        nr_reclaimed = 0;
>        nr_scanned = sc->nr_scanned;
>
Since priority is one of the factors used in computing the scan count,
we could change how a memcg is selected for reclaim:

	return target_mem_cgroup ||
		mem_cgroup_soft_limit_exceeded(memcg) ||
		priority != DEF_PRIORITY;

where detection of all mem groups under the softlimit happens at
DEF_PRIORITY-1.

Then the selected mem groups are reclaimed in the current manner, without
change in nr_to_reclaim, and sc->nr_reclaimed is distributed evenly among
mem groups, no matter whether the softlimit is exceeded or not.
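
Written out as a full helper, a sketch of how that condition might sit
in the memcg iteration (the function name and call site are assumed
here; mem_cgroup_soft_limit_exceeded() is taken from this series, and
only the three-way condition itself is the actual proposal):

	static bool should_scan_memcg(struct mem_cgroup *target_mem_cgroup,
				      struct mem_cgroup *memcg, int priority)
	{
		/* targeted (direct/limit) reclaim: always scan */
		if (target_mem_cgroup)
			return true;
		/* global reclaim: start with groups over their soft limit */
		if (mem_cgroup_soft_limit_exceeded(memcg))
			return true;
		/* priority has been raised past DEF_PRIORITY: scan everyone */
		return priority != DEF_PRIORITY;
	}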

