From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: linux-mm@kvack.org, lsf-pc@lists.linux-foundation.org,
	Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>,
	Balbir Singh <balbir@linux.vnet.ibm.com>,
	Greg Thelen <gthelen@google.com>, Ying Han <yinghan@google.com>,
	Michel Lespinasse <walken@google.com>
Subject: Re: [LSF/MM TOPIC] memory control groups
Date: Tue, 18 Jan 2011 10:10:57 +0900
Message-ID: <20110118101057.51d20ed7.kamezawa.hiroyu@jp.fujitsu.com>
In-Reply-To: <20110117191359.GI2212@cmpxchg.org>

On Mon, 17 Jan 2011 20:14:00 +0100
Johannes Weiner <hannes@cmpxchg.org> wrote:

> Hello,
> 
> on the MM summit, I would like to talk about the current state of
> memory control groups, the features and extensions that are currently
> being developed for it, and what their status is.
> 
> I am especially interested in talking about the current runtime memory
> overhead memcg comes with (1% of ram) and what we can do to shrink it.
> 
> In comparison to how efficiently struct page is packed, and given that
> distro kernels come with memcg enabled per default, I think we should
> put a bit more thought into how struct page_cgroup (which exists for
> every page in the system as well) is organized.
> 
> I have a patch series that removes the page backpointer from struct
> page_cgroup by storing a node ID (or section ID, depending on whether
> sparsemem is configured) in the free bits of pc->flags.
> 
> I also plan on replacing the pc->mem_cgroup pointer with an ID
> (KAMEZAWA-san has patches for that), and move it to pc->flags too.
> Every flag not used means doubling the amount of possible control
> groups, so I have patches that get rid of some flags currently
> allocated, including PCG_CACHE, PCG_ACCT_LRU, and PCG_MIGRATION.
> 
> [ I meant to send those out much earlier already, but a bug in the
> migration rework was not responding to my yelling 'Marco', and now my
> changes collide horribly with THP, so it will take another rebase. ]
> 
> The per-memcg dirty accounting work e.g. allocates a bunch of new bits
> in pc->flags and I'd like to hash out if this leaves enough room for
> the structure packing I described, or whether we can come up with a
> different way of tracking state.
> 

I see that there are requests to shrink page_cgroup, and yes, I think
we should do so. But there is a trade-off between performance and
memory usage, so could you show the numbers when we discuss it?

BTW, I think we can...

- The PCG_ACCT_LRU bit can be dropped. (I think list_empty(&pc->lru) can be
  used instead. The ROOT cgroup will not be a problem.)
- pc->mem_cgroup can be replaced with an ID.
  But moving it into the flags field seems difficult because of races.
- pc->page can be replaced with some lookup routine.
  But the section-bit encoding may be rather obscure, and the lookup cost
  will be a problem.
- The PCG_CACHE bit duplicates information already in 'page', so we can use
  PageAnon() instead.
- I'm not sure about PCG_MIGRATION. It's there to avoid races.
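
To illustrate the race concern with moving the ID into the flags field, here
is a minimal userspace sketch using C11 atomics. It is not kernel code: the
struct pc_model type, the pc_* helpers, and the 16/16 bit split are all
hypothetical names and values chosen for illustration. The point is that once
the ID shares a word with the state bits, a plain read-modify-write of the ID
could overwrite a flag bit that another CPU changes at the same time, so the
ID update has to be a compare-and-exchange loop (or be done under a lock).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical layout: low 16 bits are state flags, high 16 bits are the ID. */
#define PC_ID_SHIFT  16u
#define PC_FLAG_MASK ((1u << PC_ID_SHIFT) - 1u)

/* Userspace stand-in for a descriptor that keeps flags and ID in one word. */
struct pc_model {
    _Atomic uint32_t flags;
};

/* Atomically set one state bit; the ID bits are left untouched. */
static inline void pc_set_flag(struct pc_model *pc, unsigned bit)
{
    assert(bit < PC_ID_SHIFT);
    atomic_fetch_or(&pc->flags, (uint32_t)1 << bit);
}

/* Return 1 if the given state bit is set, 0 otherwise. */
static inline int pc_test_flag(struct pc_model *pc, unsigned bit)
{
    assert(bit < PC_ID_SHIFT);
    return (int)((atomic_load(&pc->flags) >> bit) & 1u);
}

/* Extract the ID stored in the upper bits. */
static inline uint16_t pc_get_id(struct pc_model *pc)
{
    return (uint16_t)(atomic_load(&pc->flags) >> PC_ID_SHIFT);
}

/*
 * Replace the ID while keeping whatever flag bits are set at the moment
 * the exchange succeeds. If another thread changes a flag bit in between,
 * the compare-exchange fails, reloads the current word into 'old', and
 * the loop retries with the fresh flag bits.
 */
static inline void pc_set_id(struct pc_model *pc, uint16_t id)
{
    uint32_t old = atomic_load(&pc->flags);
    uint32_t next;

    do {
        next = (old & PC_FLAG_MASK) | ((uint32_t)id << PC_ID_SHIFT);
    } while (!atomic_compare_exchange_weak(&pc->flags, &old, next));
}
```

The retry loop is the extra cost compared with storing a separate pointer: a
separate field can be written with a plain store, while a packed one cannot.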

Note: we'll need to use 16 bits for blkio tracking.
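
As a rough bit budget for that note, the arithmetic can be sketched as below.
The 16 bits for blkio come from the note above; the width of the memcg ID and
the helper names (pc_flag_bits_left, pc_max_groups) are assumptions made only
for this example. With a hypothetical 16-bit memcg ID, a 32-bit flags word
would have no bits left for state flags, while a 64-bit word would have 32.

```c
#include <assert.h>
#include <limits.h>

/*
 * Bits remaining for state flags in a flags word of 'word_bits' bits
 * after reserving room for a memcg ID and a blkio ID.
 * A result of zero or less means the layout does not fit.
 */
static inline int pc_flag_bits_left(int word_bits, int memcg_id_bits,
                                    int blkio_id_bits)
{
    assert(word_bits > 0 && memcg_id_bits >= 0 && blkio_id_bits >= 0);
    return word_bits - memcg_id_bits - blkio_id_bits;
}

/* Number of distinct IDs representable in 'id_bits' bits. */
static inline unsigned long pc_max_groups(int id_bits)
{
    assert(id_bits >= 0 && id_bits < (int)(sizeof(unsigned long) * CHAR_BIT));
    return 1UL << id_bits;
}
```

Each bit moved from the flags to the ID field doubles the result of
pc_max_groups(), which is why every flag that can be dropped matters.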

Another idea is dynamic allocation of page_cgroup. It may help in a THP
environment, but it will not work well (it just adds overhead) for file cache
workloads.

Anyway, my development priorities for memcg this year are:

 1. dirty ratio support.
 2. Background reclaim (kswapd)
 3. blkio tracking.

The page_cgroup diet should be done step by step. We've seen performance drop
many times when a new feature was added to the memory cgroup.

Thanks,
-Kame
