From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
To: balbir@linux.vnet.ibm.com
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"hugh@veritas.com" <hugh@veritas.com>
Subject: Re: [RFC][PATCH] fix swap entries is not reclaimed in proper way for memg v3.
Date: Mon, 27 Apr 2009 17:49:44 +0900
Message-ID: <20090427174944.86dbb94c.kamezawa.hiroyu@jp.fujitsu.com>
In-Reply-To: <20090427084347.GJ4454@balbir.in.ibm.com>

On Mon, 27 Apr 2009 14:13:47 +0530
Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
> > I like to. But there is no space to record it as stale. And "race" makes
> > that difficult even if we have enough space. If you read the whole thread,
> > you know there are many patterns of race.
> 
> There have been several iterations of this discussion, summarizing it
> would be nice, let me find the thread.
> 
First, it's obvious that there is no free space in the swap entry array or
the swap_cgroup array. (And this can be a problem even if
MEM_RES_CONTROLLER_SWAP_EXT is not used.)

I tried to record "stale" information in page_cgroup with a flag, but the
following sequence can happen, so I can't do it.

==
     CPU0(zap_pte)                 CPU1 (read swap)
                                  swap_duplicate()
     free_swapentry()
                                  add_to_swap_cache().
==
In this case, we can't know at zap_pte() whether the swap_entry is stale or not.
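
The sequence above can be sketched as a toy model. This is only an
illustration, not kernel code: the SwapEntry class, its refcount and
in_swap_cache fields, and the observe() helper are assumptions made for the
example, and the function names are borrowed from the diagram without
reproducing their real behaviour. It shows that what CPU0 can see at
zap_pte() time is the same whether the extra reference belongs to a
transient swap-in on CPU1 or to another legitimate user of the entry.

```python
from dataclasses import dataclass


@dataclass
class SwapEntry:
    # Hypothetical stand-in for the state of one swap entry.
    refcount: int = 1            # reference held by the PTE that CPU0 will zap
    in_swap_cache: bool = False  # set once a page for it is in the swap cache


def swap_duplicate(entry):
    # Take one more reference on the entry.
    entry.refcount += 1


def free_swapentry(entry):
    # Drop one reference; report whether the entry was really freed.
    entry.refcount -= 1
    return entry.refcount == 0


def add_to_swap_cache(entry):
    entry.in_swap_cache = True


def observe(entry):
    # Everything CPU0 can look at in this model.
    return (entry.refcount, entry.in_swap_cache)


def race_case():
    entry = SwapEntry()
    swap_duplicate(entry)          # CPU1: swap-in takes a reference
    seen = observe(entry)          # CPU0: view at zap_pte() time
    freed = free_swapentry(entry)  # CPU0: drops the PTE's reference
    add_to_swap_cache(entry)       # CPU1: page enters the swap cache
    return seen, freed, observe(entry)


def shared_case():
    entry = SwapEntry()
    swap_duplicate(entry)          # another process really uses the entry
    seen = observe(entry)          # CPU0: view at zap_pte() time
    freed = free_swapentry(entry)  # CPU0: drops the PTE's reference
    return seen, freed, observe(entry)
```

In this model race_case() and shared_case() return the same "seen" value, so
a flag set by CPU0 at that point could not distinguish them. Only the final
state differs: in the race case the entry ends up held by the swap cache
alone.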



> > 
> > > 2. Can't we reclaim stale entries during memcg LRU reclaim? Why write
> > > a GC for it?
> > > 
> > Because they are not on memcg LRU. we can't reclaim it by memcg LRU.
> > (See the first mail from Nishimura of this thread. It explains well.)
> >
> 
> Hmm.. I don't find it, let me do a more exhaustive search on the web.
> If the entry is stale and not on memcg LRU, it is still accounted to
> the memcg?
Yes. It is still accounted in the memcg's memory.memsw.usage_in_bytes.


>  
> > One easy case is here.
> > 
> >   - CPU0 call zap_pte()->free_swap_and_cache()
> >   - CPU1 tries to swap-in it.
> >   In this case, free_swap_and_cache() doesn't free swp_entry and swp_entry
> >   is read into the memory. But it will never be added memcg's LRU until
> >   it's mapped.
> 
> That is strange.. not even added to the LRU as a cached page?
> 
It is added to the "global" LRU but not to the memcg's LRU, because the
"USED" bit is not set.
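
As a rough sketch of why memcg reclaim never sees such a page (a toy model
under stated assumptions, not the real implementation): the Page class, the
two plain lists and the helper names below are hypothetical, and the "used"
field only stands in for the "USED" bit mentioned above.

```python
class Page:
    def __init__(self, name, used):
        self.name = name
        self.used = used  # stand-in for the "USED" bit


def add_to_lrus(page, global_lru, memcg_lru):
    # Every page goes on the global LRU; only pages with the
    # USED bit set are also linked on the memcg's LRU.
    global_lru.append(page)
    if page.used:
        memcg_lru.append(page)


def reclaim_candidates(lru):
    # A reclaim pass can only find what is linked on the list it scans.
    return [page.name for page in lru]


def demo():
    global_lru, memcg_lru = [], []
    add_to_lrus(Page("mapped", used=True), global_lru, memcg_lru)
    add_to_lrus(Page("swapin-unmapped", used=False), global_lru, memcg_lru)
    return reclaim_candidates(global_lru), reclaim_candidates(memcg_lru)
```

Here demo() lists both pages for a global scan but only "mapped" for a scan
of the memcg's LRU, which is why the choice is between scanning the whole
global LRU and some lazy trick.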


> >   (What we have to consider here is swapin-readahead. It can swap-in memory
> >    even if it's not accessed. Then, this race window is larger than expected.)
> > 
> > We can't use memcg's LRU then...what we can do is.
> > 
> >  - scanning global LRU all
> >  or
> >  - use some trick to reclaim them in lazy way.
> >
> 
> Thanks for being patient, some of these questions have been discussed
> before I suppose. Let me dig out the thread. 
> 

Sorry for the lack of explanation. I'll add more text to the v4 patch.

Thanks,
-kame



