From: Ying Han <yinghan@google.com>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Pavel Emelyanov <xemul@openvz.org>,
	Balbir Singh <balbir@linux.vnet.ibm.com>,
	Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>,
	Li Zefan <lizf@cn.fujitsu.com>, Mel Gorman <mel@csn.ul.ie>,
	Christoph Lameter <cl@linux.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Rik van Riel <riel@redhat.com>, Hugh Dickins <hughd@google.com>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	Tejun Heo <tj@kernel.org>, Michal Hocko <mhocko@suse.cz>,
	Andrew Morton <akpm@linux-foundation.org>,
	Dave Hansen <dave@linux.vnet.ibm.com>,
	linux-mm@kvack.org
Subject: Re: [PATCH V3 5/7] Per-memcg background reclaim.
Date: Wed, 13 Apr 2011 15:45:14 -0700	[thread overview]
Message-ID: <BANLkTikMEYTkRq5Bq2JZh0zibQw66pDzkQ@mail.gmail.com> (raw)
In-Reply-To: <20110413175842.36938786.kamezawa.hiroyu@jp.fujitsu.com>


On Wed, Apr 13, 2011 at 1:58 AM, KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:

> On Wed, 13 Apr 2011 00:03:05 -0700
> Ying Han <yinghan@google.com> wrote:
>
> > This is the main loop of per-memcg background reclaim, implemented in the
> > function balance_mem_cgroup_pgdat().
> >
> > The function performs a priority loop similar to global reclaim. During each
> > iteration it invokes balance_pgdat_node() for all nodes on the system, which
> > is another new function that performs background reclaim per node. A fairness
> > mechanism is implemented to remember the last node it reclaimed from and to
> > always start at the next one. After reclaiming each node, it checks
> > mem_cgroup_watermark_ok() and breaks the priority loop if it returns true. The
> > per-memcg zone will be marked as "unreclaimable" if the scanning rate is much
> > greater than the reclaiming rate on the per-memcg LRU. The bit is cleared when
> > a page charged to the memcg is freed. Kswapd breaks the priority loop if all
> > the zones are marked "unreclaimable".
> >
>
> Hmm, this is bigger than expected. I'd be glad if you could divide it into
> smaller pieces.
> See below.
>
>
> > changelog v3..v2:
> > 1. change mz->all_unreclaimable to be boolean.
> > 2. define the ZONE_RECLAIMABLE_RATE macro shared by zone and per-memcg reclaim.
> > 3. some more clean-up.
> >
> > changelog v2..v1:
> > 1. move the per-memcg per-zone clear_unreclaimable into the uncharge stage.
> > 2. share kswapd_run/kswapd_stop for per-memcg and global background reclaim.
> > 3. name the per-memcg kswapd as "memcg-id" (css->id), while the global kswapd
> > keeps the same name.
> > 4. fix a race on kswapd_stop where the per-memcg-per-zone info could be
> > accessed after freeing.
> > 5. add fairness to the zonelist, where each memcg remembers the last zone
> > reclaimed from.
> >
> > Signed-off-by: Ying Han <yinghan@google.com>
> > ---
> >  include/linux/memcontrol.h |   33 +++++++
> >  include/linux/swap.h       |    2 +
> >  mm/memcontrol.c            |  136 +++++++++++++++++++++++++++++
> >  mm/vmscan.c                |  208 ++++++++++++++++++++++++++++++++++++++++++++
> >  4 files changed, 379 insertions(+), 0 deletions(-)
> >
> > diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> > index f7ffd1f..a8159f5 100644
> > --- a/include/linux/memcontrol.h
> > +++ b/include/linux/memcontrol.h
> > @@ -88,6 +88,9 @@ extern int mem_cgroup_init_kswapd(struct mem_cgroup *mem,
> >                                 struct kswapd *kswapd_p);
> >  extern void mem_cgroup_clear_kswapd(struct mem_cgroup *mem);
> >  extern wait_queue_head_t *mem_cgroup_kswapd_wait(struct mem_cgroup *mem);
> > +extern int mem_cgroup_last_scanned_node(struct mem_cgroup *mem);
> > +extern int mem_cgroup_select_victim_node(struct mem_cgroup *mem,
> > +                                     const nodemask_t *nodes);
> >
> >  static inline
> >  int mm_match_cgroup(const struct mm_struct *mm, const struct mem_cgroup *cgroup)
> > @@ -152,6 +155,12 @@ static inline void mem_cgroup_dec_page_stat(struct page *page,
> >  unsigned long mem_cgroup_soft_limit_reclaim(struct zone *zone, int order,
> >                                               gfp_t gfp_mask);
> >  u64 mem_cgroup_get_limit(struct mem_cgroup *mem);
> > +void mem_cgroup_clear_unreclaimable(struct mem_cgroup *mem, struct page *page);
> > +bool mem_cgroup_zone_reclaimable(struct mem_cgroup *mem, int nid, int zid);
> > +bool mem_cgroup_mz_unreclaimable(struct mem_cgroup *mem, struct zone *zone);
> > +void mem_cgroup_mz_set_unreclaimable(struct mem_cgroup *mem, struct zone *zone);
> > +void mem_cgroup_mz_pages_scanned(struct mem_cgroup *mem, struct zone* zone,
> > +                             unsigned long nr_scanned);
> >
> >  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
> >  void mem_cgroup_split_huge_fixup(struct page *head, struct page *tail);
> > @@ -342,6 +351,25 @@ static inline void mem_cgroup_dec_page_stat(struct page *page,
> >  {
> >  }
> >
> > +static inline void mem_cgroup_mz_pages_scanned(struct mem_cgroup *mem,
> > +                                             struct zone *zone,
> > +                                             unsigned long nr_scanned)
> > +{
> > +}
> > +
> > +static inline void mem_cgroup_clear_unreclaimable(struct mem_cgroup *mem,
> > +                                                     struct page *page)
> > +{
> > +}
> > +static inline void mem_cgroup_mz_set_unreclaimable(struct mem_cgroup *mem,
> > +             struct zone *zone)
> > +{
> > +}
> > +static inline bool mem_cgroup_mz_unreclaimable(struct mem_cgroup *mem,
> > +                                             struct zone *zone)
> > +{
> > +     return false;
> > +}
> > +
> >  static inline
> >  unsigned long mem_cgroup_soft_limit_reclaim(struct zone *zone, int order,
> >                                           gfp_t gfp_mask)
> > @@ -360,6 +388,11 @@ static inline void mem_cgroup_split_huge_fixup(struct page *head,
> >  {
> >  }
> >
> > +static inline bool mem_cgroup_zone_reclaimable(struct mem_cgroup *mem, int nid,
> > +                                                             int zid)
> > +{
> > +     return false;
> > +}
> >  #endif /* CONFIG_CGROUP_MEM_CONT */
> >
> >  #if !defined(CONFIG_CGROUP_MEM_RES_CTLR) || !defined(CONFIG_DEBUG_VM)
> > diff --git a/include/linux/swap.h b/include/linux/swap.h
> > index 17e0511..319b800 100644
> > --- a/include/linux/swap.h
> > +++ b/include/linux/swap.h
> > @@ -160,6 +160,8 @@ enum {
> >       SWP_SCANNING    = (1 << 8),     /* refcount in scan_swap_map */
> >  };
> >
> > +#define ZONE_RECLAIMABLE_RATE 6
> > +
> >  #define SWAP_CLUSTER_MAX 32
> >  #define COMPACT_CLUSTER_MAX SWAP_CLUSTER_MAX
> >
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index acd84a8..efeade3 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -133,7 +133,10 @@ struct mem_cgroup_per_zone {
> >       bool                    on_tree;
> >       struct mem_cgroup       *mem;           /* Back pointer, we cannot */
> >                                               /* use container_of        */
> > +     unsigned long           pages_scanned;  /* since last reclaim */
> > +     bool                    all_unreclaimable;      /* All pages pinned */
> >  };
> > +
> >  /* Macro for accessing counter */
> >  #define MEM_CGROUP_ZSTAT(mz, idx)    ((mz)->count[(idx)])
> >
> > @@ -275,6 +278,11 @@ struct mem_cgroup {
> >
> >       int wmark_ratio;
> >
> > +     /* While doing per cgroup background reclaim, we cache the
> > +      * last node we reclaimed from
> > +      */
> > +     int last_scanned_node;
> > +
> >       wait_queue_head_t *kswapd_wait;
> >  };
> >
> > @@ -1129,6 +1137,96 @@ mem_cgroup_get_reclaim_stat_from_page(struct page *page)
> >       return &mz->reclaim_stat;
> >  }
> >
> > +static unsigned long mem_cgroup_zone_reclaimable_pages(
> > +                                     struct mem_cgroup_per_zone *mz)
> > +{
> > +     int nr;
> > +     nr = MEM_CGROUP_ZSTAT(mz, LRU_ACTIVE_FILE) +
> > +             MEM_CGROUP_ZSTAT(mz, LRU_INACTIVE_FILE);
> > +
> > +     if (nr_swap_pages > 0)
> > +             nr += MEM_CGROUP_ZSTAT(mz, LRU_ACTIVE_ANON) +
> > +                     MEM_CGROUP_ZSTAT(mz, LRU_INACTIVE_ANON);
> > +
> > +     return nr;
> > +}
> > +
> > +void mem_cgroup_mz_pages_scanned(struct mem_cgroup *mem, struct zone* zone,
> > +                                             unsigned long nr_scanned)
> > +{
> > +     struct mem_cgroup_per_zone *mz = NULL;
> > +     int nid = zone_to_nid(zone);
> > +     int zid = zone_idx(zone);
> > +
> > +     if (!mem)
> > +             return;
> > +
> > +     mz = mem_cgroup_zoneinfo(mem, nid, zid);
> > +     if (mz)
> > +             mz->pages_scanned += nr_scanned;
> > +}
> > +
> > +bool mem_cgroup_zone_reclaimable(struct mem_cgroup *mem, int nid, int zid)
> > +{
> > +     struct mem_cgroup_per_zone *mz = NULL;
> > +
> > +     if (!mem)
> > +             return false;
> > +
> > +     mz = mem_cgroup_zoneinfo(mem, nid, zid);
> > +     if (mz)
> > +             return mz->pages_scanned <
> > +                             mem_cgroup_zone_reclaimable_pages(mz) *
> > +                             ZONE_RECLAIMABLE_RATE;
> > +     return false;
> > +}
> > +
> > +bool mem_cgroup_mz_unreclaimable(struct mem_cgroup *mem, struct zone *zone)
> > +{
> > +     struct mem_cgroup_per_zone *mz = NULL;
> > +     int nid = zone_to_nid(zone);
> > +     int zid = zone_idx(zone);
> > +
> > +     if (!mem)
> > +             return false;
> > +
> > +     mz = mem_cgroup_zoneinfo(mem, nid, zid);
> > +     if (mz)
> > +             return mz->all_unreclaimable;
> > +
> > +     return false;
> > +}
> > +
> > +void mem_cgroup_mz_set_unreclaimable(struct mem_cgroup *mem, struct zone *zone)
> > +{
> > +     struct mem_cgroup_per_zone *mz = NULL;
> > +     int nid = zone_to_nid(zone);
> > +     int zid = zone_idx(zone);
> > +
> > +     if (!mem)
> > +             return;
> > +
> > +     mz = mem_cgroup_zoneinfo(mem, nid, zid);
> > +     if (mz)
> > +             mz->all_unreclaimable = true;
> > +}
> > +
> > +void mem_cgroup_clear_unreclaimable(struct mem_cgroup *mem, struct page *page)
> > +{
> > +     struct mem_cgroup_per_zone *mz = NULL;
> > +
> > +     if (!mem)
> > +             return;
> > +
> > +     mz = page_cgroup_zoneinfo(mem, page);
> > +     if (mz) {
> > +             mz->pages_scanned = 0;
> > +             mz->all_unreclaimable = false;
> > +     }
> > +
> > +     return;
> > +}
> > +
> >  unsigned long mem_cgroup_isolate_pages(unsigned long nr_to_scan,
> >                                       struct list_head *dst,
> >                                       unsigned long *scanned, int order,
> > @@ -1545,6 +1643,32 @@ static int mem_cgroup_hierarchical_reclaim(struct mem_cgroup *root_mem,
> >  }
> >
> >  /*
> > + * Visit the first node after the last_scanned_node of @mem and use that to
> > + * reclaim free pages from.
> > + */
> > +int
> > +mem_cgroup_select_victim_node(struct mem_cgroup *mem, const nodemask_t *nodes)
> > +{
> > +     int next_nid;
> > +     int last_scanned;
> > +
> > +     last_scanned = mem->last_scanned_node;
> > +
> > +     /* Initial stage and start from node0 */
> > +     if (last_scanned == -1)
> > +             next_nid = 0;
> > +     else
> > +             next_nid = next_node(last_scanned, *nodes);
> > +
> > +     if (next_nid == MAX_NUMNODES)
> > +             next_nid = first_node(*nodes);
> > +
> > +     mem->last_scanned_node = next_nid;
> > +
> > +     return next_nid;
> > +}
> > +
> > +/*
> >   * Check OOM-Killer is already running under our hierarchy.
> >   * If someone is running, return false.
> >   */
> > @@ -2779,6 +2903,7 @@ __mem_cgroup_uncharge_common(struct page *page, enum charge_type ctype)
> >        * special functions.
> >        */
> >
> > +     mem_cgroup_clear_unreclaimable(mem, page);
>
> Hmm, do we need to do this on every uncharge?
>
> I doubt we really need mz->all_unreclaimable ....
>
> Anyway, I'd like to see this all_unreclaimable logic in an independent
> patch, because the direct-reclaim pass should see this, too.
>
> So, could you divide this into pieces:
>
> 1. record last node .... I wonder if this logic should be used in the
> direct-reclaim pass, too.
>
> 2. all_unreclaimable .... direct reclaim will be affected, too.
>
> 3. scanning core.
>

Ok. will make the change for the next post.
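
To make the split concrete: the all_unreclaimable piece is essentially the
rate test this patch already carries, shared with the direct-reclaim path. A
minimal sketch of such a shared helper (the helper name is hypothetical; the
test is the one from mem_cgroup_zone_reclaimable() above):

/*
 * Hypothetical shared helper, mirroring zone_reclaimable() on the global
 * LRU: a per-memcg zone counts as reclaimable while the pages scanned
 * since the last reclaim stay below ZONE_RECLAIMABLE_RATE (6) times the
 * reclaimable pages left on its LRUs.
 */
static bool mem_cgroup_mz_reclaimable(struct mem_cgroup_per_zone *mz)
{
	return mz->pages_scanned <
	       mem_cgroup_zone_reclaimable_pages(mz) * ZONE_RECLAIMABLE_RATE;
}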

>
>
>
> >       unlock_page_cgroup(pc);
> >       /*
> >        * even after unlock, we have mem->res.usage here and this memcg
> > @@ -4501,6 +4626,8 @@ static int alloc_mem_cgroup_per_zone_info(struct
> mem_cgroup *mem, int node)
> >               mz->usage_in_excess = 0;
> >               mz->on_tree = false;
> >               mz->mem = mem;
> > +             mz->pages_scanned = 0;
> > +             mz->all_unreclaimable = false;
> >       }
> >       return 0;
> >  }
> > @@ -4651,6 +4778,14 @@ wait_queue_head_t *mem_cgroup_kswapd_wait(struct
> mem_cgroup *mem)
> >       return mem->kswapd_wait;
> >  }
> >
> > +int mem_cgroup_last_scanned_node(struct mem_cgroup *mem)
> > +{
> > +     if (!mem)
> > +             return -1;
> > +
> > +     return mem->last_scanned_node;
> > +}
> > +
> >  static int mem_cgroup_soft_limit_tree_init(void)
> >  {
> >       struct mem_cgroup_tree_per_node *rtpn;
> > @@ -4726,6 +4861,7 @@ mem_cgroup_create(struct cgroup_subsys *ss, struct
> cgroup *cont)
> >               res_counter_init(&mem->memsw, NULL);
> >       }
> >       mem->last_scanned_child = 0;
> > +     mem->last_scanned_node = -1;
> >       INIT_LIST_HEAD(&mem->oom_notify);
> >
> >       if (parent)
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index a1a1211..6571eb8 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -47,6 +47,8 @@
> >
> >  #include <linux/swapops.h>
> >
> > +#include <linux/res_counter.h>
> > +
> >  #include "internal.h"
> >
> >  #define CREATE_TRACE_POINTS
> > @@ -111,6 +113,8 @@ struct scan_control {
> >        * are scanned.
> >        */
> >       nodemask_t      *nodemask;
> > +
> > +     int priority;
> >  };
> >
> >  #define lru_to_page(_head) (list_entry((_head)->prev, struct page, lru))
> > @@ -1410,6 +1414,9 @@ shrink_inactive_list(unsigned long nr_to_scan, struct zone *zone,
> >                                       ISOLATE_BOTH : ISOLATE_INACTIVE,
> >                       zone, sc->mem_cgroup,
> >                       0, file);
> > +
> > +             mem_cgroup_mz_pages_scanned(sc->mem_cgroup, zone, nr_scanned);
> > +
> >               /*
> >                * mem_cgroup_isolate_pages() keeps track of
> >                * scanned pages on its own.
> > @@ -1529,6 +1536,7 @@ static void shrink_active_list(unsigned long nr_pages, struct zone *zone,
> >                * mem_cgroup_isolate_pages() keeps track of
> >                * scanned pages on its own.
> >                */
> > +             mem_cgroup_mz_pages_scanned(sc->mem_cgroup, zone, pgscanned);
> >       }
> >
> >       reclaim_stat->recent_scanned[file] += nr_taken;
> > @@ -2632,11 +2640,211 @@ static void kswapd_try_to_sleep(struct kswapd *kswapd_p, int order,
> >       finish_wait(wait_h, &wait);
> >  }
> >
> > +#ifdef CONFIG_CGROUP_MEM_RES_CTLR
> > +/*
> > + * The function is used for the per-memcg LRU. It scans all the zones of the
> > + * node and returns the nr_scanned and nr_reclaimed.
> > + */
> > +static void balance_pgdat_node(pg_data_t *pgdat, int order,
> > +                                     struct scan_control *sc)
> > +{
> > +     int i, end_zone;
> > +     unsigned long total_scanned;
> > +     struct mem_cgroup *mem_cont = sc->mem_cgroup;
> > +     int priority = sc->priority;
> > +     int nid = pgdat->node_id;
> > +
> > +     /*
> > +      * Scan in the highmem->dma direction for the highest
> > +      * zone which needs scanning
> > +      */
> > +     for (i = pgdat->nr_zones - 1; i >= 0; i--) {
> > +             struct zone *zone = pgdat->node_zones + i;
> > +
> > +             if (!populated_zone(zone))
> > +                     continue;
> > +
> > +             if (mem_cgroup_mz_unreclaimable(mem_cont, zone) &&
> > +                             priority != DEF_PRIORITY)
> > +                     continue;
> > +             /*
> > +              * Do some background aging of the anon list, to give
> > +              * pages a chance to be referenced before reclaiming.
> > +              */
> > +             if (inactive_anon_is_low(zone, sc))
> > +                     shrink_active_list(SWAP_CLUSTER_MAX, zone,
> > +                                                     sc, priority, 0);
> > +
> > +             end_zone = i;
> > +             goto scan;
> > +     }
>
> I don't want to see zone balancing logic in memcg.
> That should be the job of the global LRU.
>
> IOW, even if we eventually remove the global LRU, the zone balancing logic
> should be implemented in the _global_ (per-node) kswapd.
> (kswapd can pass a zone mask to each memcg.)
>
> If you want some clever memcg-special logic, I think it should be logic for
> detecting 'which node should be the victim' rather than round-robin.
> (But yes, starting from easy round-robin makes sense.)
>
> So, could you add a simpler one?
>
>  do {
>    select victim node
>    do reclaim
>  } while (need_stop)
>
> Zone balancing should be done outside of memcg.
>
> What we really need to improve is 'select victim node'.
>

I will separate out the logic in the next post, so it will be easier to
optimize each piece of functionality individually.
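
As a rough illustration, the simplified shape suggested above could look
something like this (a sketch only: shrink_node_memcg() is a hypothetical
stand-in for the scanning core once it is split out; the watermark and
victim-selection helpers are the ones from this series):

/*
 * Reclaim loop with no zone balancing in memcg code: pick a victim node
 * round-robin, reclaim from it, and stop once the high watermark is met
 * or no node has anything reclaimable left.
 */
static void balance_mem_cgroup(struct mem_cgroup *mem_cont)
{
	nodemask_t do_nodes = node_states[N_ONLINE];
	int nid;

	do {
		nid = mem_cgroup_select_victim_node(mem_cont, &do_nodes);

		/* Drop a node once nothing on it is reclaimable. */
		if (!shrink_node_memcg(NODE_DATA(nid), mem_cont))
			node_clear(nid, do_nodes);
	} while (!mem_cgroup_watermark_ok(mem_cont, CHARGE_WMARK_HIGH) &&
		 !nodes_empty(do_nodes));
}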

--Ying

>
> Thanks,
> -Kame
>
>
> > +     return;
> > +
> > +scan:
> > +     total_scanned = 0;
> > +     /*
> > +      * Now scan the zone in the dma->highmem direction, stopping
> > +      * at the last zone which needs scanning.
> > +      *
> > +      * We do this because the page allocator works in the opposite
> > +      * direction.  This prevents the page allocator from allocating
> > +      * pages behind kswapd's direction of progress, which would
> > +      * cause too much scanning of the lower zones.
> > +      */
> > +     for (i = 0; i <= end_zone; i++) {
> > +             struct zone *zone = pgdat->node_zones + i;
> > +
> > +             if (!populated_zone(zone))
> > +                     continue;
> > +
> > +             if (mem_cgroup_mz_unreclaimable(mem_cont, zone) &&
> > +                     priority != DEF_PRIORITY)
> > +                     continue;
> > +
> > +             sc->nr_scanned = 0;
> > +             shrink_zone(priority, zone, sc);
> > +             total_scanned += sc->nr_scanned;
> > +
> > +             if (mem_cgroup_mz_unreclaimable(mem_cont, zone))
> > +                     continue;
> > +
> > +             if (!mem_cgroup_zone_reclaimable(mem_cont, nid, i))
> > +                     mem_cgroup_mz_set_unreclaimable(mem_cont, zone);
> > +
> > +             /*
> > +              * If we've done a decent amount of scanning and
> > +              * the reclaim ratio is low, start doing writepage
> > +              * even in laptop mode
> > +              */
> > +             if (total_scanned > SWAP_CLUSTER_MAX * 2 &&
> > +                 total_scanned > sc->nr_reclaimed + sc->nr_reclaimed / 2) {
> > +                     sc->may_writepage = 1;
> > +             }
> > +     }
> > +
> > +     sc->nr_scanned = total_scanned;
> > +     return;
> > +}
> > +
> > +/*
> > + * Per cgroup background reclaim.
> > + * TODO: Drop the order argument since memcg always does order-0 reclaim
> > + */
> > +static unsigned long balance_mem_cgroup_pgdat(struct mem_cgroup *mem_cont,
> > +                                           int order)
> > +{
> > +     int i, nid;
> > +     int start_node;
> > +     int priority;
> > +     bool wmark_ok;
> > +     int loop;
> > +     pg_data_t *pgdat;
> > +     nodemask_t do_nodes;
> > +     unsigned long total_scanned;
> > +     struct scan_control sc = {
> > +             .gfp_mask = GFP_KERNEL,
> > +             .may_unmap = 1,
> > +             .may_swap = 1,
> > +             .nr_to_reclaim = ULONG_MAX,
> > +             .swappiness = vm_swappiness,
> > +             .order = order,
> > +             .mem_cgroup = mem_cont,
> > +     };
> > +
> > +loop_again:
> > +     do_nodes = NODE_MASK_NONE;
> > +     sc.may_writepage = !laptop_mode;
> > +     sc.nr_reclaimed = 0;
> > +     total_scanned = 0;
> > +
> > +     for (priority = DEF_PRIORITY; priority >= 0; priority--) {
> > +             sc.priority = priority;
> > +             wmark_ok = false;
> > +             loop = 0;
> > +
> > +             /* The swap token gets in the way of swapout... */
> > +             if (!priority)
> > +                     disable_swap_token();
> > +
> > +             if (priority == DEF_PRIORITY)
> > +                     do_nodes = node_states[N_ONLINE];
> > +
> > +             while (1) {
> > +                     nid = mem_cgroup_select_victim_node(mem_cont,
> > +                                                     &do_nodes);
> > +
> > +                     /* Indicate we have cycled the nodelist once
> > +                      * TODO: we might add MAX_RECLAIM_LOOP for preventing
> > +                      * kswapd burning cpu cycles.
> > +                      */
> > +                     if (loop == 0) {
> > +                             start_node = nid;
> > +                             loop++;
> > +                     } else if (nid == start_node)
> > +                             break;
> > +
> > +                     pgdat = NODE_DATA(nid);
> > +                     balance_pgdat_node(pgdat, order, &sc);
> > +                     total_scanned += sc.nr_scanned;
> > +
> > +                     /* Check whether the node still has at least
> > +                      * one reclaimable zone; if not, drop it
> > +                      */
> > +                     for (i = pgdat->nr_zones - 1; i >= 0; i--) {
> > +                             struct zone *zone = pgdat->node_zones + i;
> > +
> > +                             if (!populated_zone(zone))
> > +                                     continue;
> > +
> > +                             if (!mem_cgroup_mz_unreclaimable(mem_cont,
> > +                                                             zone))
> > +                                     break;
> > +                     }
> > +                     if (i < 0)
> > +                             node_clear(nid, do_nodes);
> > +
> > +                     if (mem_cgroup_watermark_ok(mem_cont,
> > +                                                     CHARGE_WMARK_HIGH)) {
> > +                             wmark_ok = true;
> > +                             goto out;
> > +                     }
> > +
> > +                     if (nodes_empty(do_nodes)) {
> > +                             wmark_ok = true;
> > +                             goto out;
> > +                     }
> > +             }
> > +
> > +             /* All the nodes are unreclaimable, kswapd is done */
> > +             if (nodes_empty(do_nodes)) {
> > +                     wmark_ok = true;
> > +                     goto out;
> > +             }
> > +
> > +             if (total_scanned && priority < DEF_PRIORITY - 2)
> > +                     congestion_wait(WRITE, HZ/10);
> > +
> > +             if (sc.nr_reclaimed >= SWAP_CLUSTER_MAX)
> > +                     break;
> > +     }
> > +out:
> > +     if (!wmark_ok) {
> > +             cond_resched();
> > +
> > +             try_to_freeze();
> > +
> > +             goto loop_again;
> > +     }
> > +
> > +     return sc.nr_reclaimed;
> > +}
> > +#else
> >  static unsigned long balance_mem_cgroup_pgdat(struct mem_cgroup *mem_cont,
> >                                                       int order)
> >  {
> >       return 0;
> >  }
> > +#endif
> >
> >  /*
> >   * The background pageout daemon, started as a kernel thread
> > --
> > 1.7.3.1
> >
>
>


