From: Michal Hocko <mhocko@suse.cz>
To: Johannes Weiner <jweiner@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>,
	Balbir Singh <bsingharora@gmail.com>,
	Ying Han <yinghan@google.com>, Greg Thelen <gthelen@google.com>,
	Michel Lespinasse <walken@google.com>,
	Rik van Riel <riel@redhat.com>,
	Minchan Kim <minchan.kim@gmail.com>,
	Christoph Hellwig <hch@infradead.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [patch 10/11] mm: make per-memcg LRU lists exclusive
Date: Wed, 21 Sep 2011 18:05:18 +0200
Message-ID: <20110921160518.GK8501@tiehlicka.suse.cz>
In-Reply-To: <20110921154745.GA25828@redhat.com>

On Wed 21-09-11 17:47:45, Johannes Weiner wrote:
> On Wed, Sep 21, 2011 at 05:24:58PM +0200, Michal Hocko wrote:
> > On Mon 12-09-11 12:57:27, Johannes Weiner wrote:
[...]
> > > @@ -934,115 +954,123 @@ EXPORT_SYMBOL(mem_cgroup_count_vm_event);
> > >   * When moving account, the page is not on LRU. It's isolated.
> > >   */
> > >  
> > > -struct page *mem_cgroup_lru_to_page(struct zone *zone, struct mem_cgroup *mem,
> > > -				    enum lru_list lru)
> > > +/**
> > > + * mem_cgroup_lru_add_list - account for adding an lru page and return lruvec
> > > + * @zone: zone of the page
> > > + * @page: the page
> > > + * @lru: current lru
> > > + *
> > > + * This function accounts for @page being added to @lru, and returns
> > > + * the lruvec for the given @zone and the memcg @page is charged to.
> > > + *
> > > + * The callsite is then responsible for physically linking the page to
> > > + * the returned lruvec->lists[@lru].
> > > + */
> > > +struct lruvec *mem_cgroup_lru_add_list(struct zone *zone, struct page *page,
> > > +				       enum lru_list lru)
> > 
> > I know that names are always tricky but what about mem_cgroup_acct_lru_add?
> > Analogously for mem_cgroup_lru_del_list, mem_cgroup_lru_del and
> > mem_cgroup_lru_move_lists.
> 
> Hmm, but it doesn't just lru-account, it also looks up the right
> lruvec for the caller to link the page to, so it's not necessarily an
> improvement, although I agree that the name could be better.

Sorry, I do not have any better idea. I would just like it if the name
did not suggest that we actually modify the list.
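
To illustrate the split being discussed, here is a minimal userspace
sketch, not the kernel code. All types are simplified stand-ins, and
the names lru_add_list(), add_page_to_lru(), the count[] array and the
page->lruvec pointer are assumptions made up for this example. The
helper only accounts and returns the lruvec; the callsite does the
physical linking.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-ins, NOT the kernel definitions. */
enum lru_list { LRU_INACTIVE_ANON, LRU_ACTIVE_ANON, NR_LRU_LISTS };

struct list_head { struct list_head *next, *prev; };

struct lruvec {
	struct list_head lists[NR_LRU_LISTS];
	unsigned long count[NR_LRU_LISTS];	/* hypothetical per-list counter */
};

struct page {
	struct list_head lru;
	struct lruvec *lruvec;	/* stands in for the zone + memcg lookup */
};

static void init_list_head(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Link entry right after head, as the kernel's list_add() does. */
static void list_add(struct list_head *entry, struct list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

static void lruvec_init(struct lruvec *lruvec)
{
	for (int i = 0; i < NR_LRU_LISTS; i++) {
		init_list_head(&lruvec->lists[i]);
		lruvec->count[i] = 0;
	}
}

/* Accounts for the page and returns its lruvec; touches no list. */
static struct lruvec *lru_add_list(struct page *page, enum lru_list lru)
{
	struct lruvec *lruvec = page->lruvec;

	assert(lru < NR_LRU_LISTS);
	lruvec->count[lru]++;
	return lruvec;
}

/* The callsite is responsible for the physical linking. */
static void add_page_to_lru(struct page *page, enum lru_list lru)
{
	struct lruvec *lruvec = lru_add_list(page, lru);

	list_add(&page->lru, &lruvec->lists[lru]);
}
```

After lru_add_list() alone the counter has changed but the list is
still empty, which is why a name containing "add_list" can mislead.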

> 
> > > @@ -3615,11 +3593,11 @@ unsigned long mem_cgroup_soft_limit_reclaim(struct zone *zone, int order,
> > >  static int mem_cgroup_force_empty_list(struct mem_cgroup *memcg,
> > >  				int node, int zid, enum lru_list lru)
> > >  {
> > > -	struct zone *zone;
> > >  	struct mem_cgroup_per_zone *mz;
> > > -	struct page_cgroup *pc, *busy;
> > >  	unsigned long flags, loop;
> > >  	struct list_head *list;
> > > +	struct page *busy;
> > > +	struct zone *zone;
> > 
> > Any specific reason to move zone declaration down here? Not that it
> > matters much. Just curious.
> 
> I find this arrangement more readable, I believe Ingo Molnar called it
> the reverse christmas tree once :-).  Longest lines first, then sort
> lines of equal length alphabetically.
> 
> And since it was basically complete, except for @zone, I just HAD to!

:)

> 
> > > @@ -3639,16 +3618,16 @@ static int mem_cgroup_force_empty_list(struct mem_cgroup *memcg,
> > >  			spin_unlock_irqrestore(&zone->lru_lock, flags);
> > >  			break;
> > >  		}
> > > -		pc = list_entry(list->prev, struct page_cgroup, lru);
> > > -		if (busy == pc) {
> > > -			list_move(&pc->lru, list);
> > > +		page = list_entry(list->prev, struct page, lru);
> > > +		if (busy == page) {
> > > +			list_move(&page->lru, list);
> > >  			busy = NULL;
> > >  			spin_unlock_irqrestore(&zone->lru_lock, flags);
> > >  			continue;
> > >  		}
> > >  		spin_unlock_irqrestore(&zone->lru_lock, flags);
> > >  
> > > -		page = lookup_cgroup_page(pc);
> > > +		pc = lookup_page_cgroup(page);
> > 
> > lookup_page_cgroup might return NULL so we probably want BUG_ON(!pc)
> > here. We are not very consistent about checking the return value,
> > though.
> 
> I think this is a myth and we should remove all those checks.  How can
> pages circulate in userspace before they are fully onlined and their
> page_cgroup buddies allocated?  In this case: how would they have been
> charged in the first place and sit on a list without a list_head? :-)

Yes, that is right. This should never happen (famous last words). I can
imagine that a memory offlining bug could cause issues, though.

Anyway, the more appropriate way to handle that would be to BUG_ON
directly in lookup_page_cgroup.
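
Roughly what I have in mind, as a userspace sketch only. BUG_ON is
modelled with abort(), the page->pc pointer stands in for the real
lookup, and lookup_page_cgroup_checked() is a made-up name for this
example, not an existing kernel function. The check lives in the
lookup helper, so callers do not each need their own.

```c
#include <stdio.h>
#include <stdlib.h>

/* Userspace model of BUG_ON: report the location and abort. */
#define BUG_ON(cond)							\
	do {								\
		if (cond) {						\
			fprintf(stderr, "BUG at %s:%d\n",		\
				__FILE__, __LINE__);			\
			abort();					\
		}							\
	} while (0)

/* Simplified stand-ins, NOT the kernel definitions. */
struct page_cgroup {
	unsigned long flags;
};

struct page {
	struct page_cgroup *pc;	/* stands in for the real lookup */
};

/* Hypothetical helper: check once here instead of at every callsite. */
static struct page_cgroup *lookup_page_cgroup_checked(struct page *page)
{
	struct page_cgroup *pc = page->pc;

	BUG_ON(!pc);
	return pc;
}
```

With this in place the callsite is simply
pc = lookup_page_cgroup_checked(page); with no NULL check after it.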

-- 
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9    
Czech Republic
