* [PATCH 0/3] fix stale swap cache account leak in memcg v7
From: KAMEZAWA Hiroyuki @ 2009-05-12  1:44 UTC (permalink / raw)
  To: linux-mm; +Cc: balbir, nishimura, akpm, mingo, linux-kernel

I hope this version gets acks.
==
As Nishimura reported, there is a race in swap cache handling.

The typical cases are as follows (from Nishimura's mail):


== Type-1 ==
  If some pages of processA have been swapped out, it calls free_swap_and_cache().
  If, at the same time, processB is calling read_swap_cache_async() on
  a swap entry *that is used by processA*, a race like the one below can happen.

            processA                   |           processB
  -------------------------------------+-------------------------------------
    (free_swap_and_cache())            |  (read_swap_cache_async())
                                       |    swap_duplicate()
                                       |    __set_page_locked()
                                       |    add_to_swap_cache()
      swap_entry_free() == 0           |
      find_get_page() -> found         |
      trylock_page() -> fail & return  |
                                       |    lru_cache_add_anon()
                                       |      doesn't link this page to memcg's
                                       |      LRU, because of !PageCgroupUsed.

  This type of leak can be avoided by setting /proc/sys/vm/page-cluster to 0.
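
  Why page-cluster = 0 helps: swapin-readahead reads up to 2^page-cluster
  swap entries per fault, so with 0 only the requested entry is read and no
  entry owned by another process is pulled into swap cache. Below is a
  minimal userspace sketch of that window arithmetic. It is a simplified
  model, not the kernel code: the struct and function names are hypothetical,
  and the real code also skips unused entries and the swap header.

```c
/* Hypothetical helper type: half-open range [first, last) of swap offsets. */
struct swap_range {
    unsigned long first;
    unsigned long last;
};

/*
 * Simplified model (assumption): the readahead window is the aligned
 * block of 2^page_cluster swap offsets that contains the faulting offset.
 */
static struct swap_range readahead_range(unsigned long offset,
                                         unsigned int page_cluster)
{
    struct swap_range r;

    r.first = (offset >> page_cluster) << page_cluster;
    r.last = r.first + (1UL << page_cluster);
    return r;
}

/* Number of entries read in addition to the one that was requested. */
static unsigned long extra_entries(unsigned long offset,
                                   unsigned int page_cluster)
{
    struct swap_range r = readahead_range(offset, page_cluster);

    return (r.last - r.first) - 1;
}
```

  In this model, a fault on offset 13 with page-cluster = 3 covers offsets
  8..15, i.e. 7 extra entries that may belong to someone else; with
  page-cluster = 0 there are none.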


== Type-2 ==
    Assume processA is exiting and a pte points to a page (!PageSwapCache),
    and processB is trying to reclaim the page.

              processA                   |           processB
    -------------------------------------+-------------------------------------
      (page_remove_rmap())               |  (shrink_page_list())
         mem_cgroup_uncharge_page()      |
            ->uncharged because it's not |
              PageSwapCache yet.         |
              So, both mem/memsw.usage   |
              are decremented.           |
                                         |    add_to_swap() -> added to swap cache.

    If this page goes through without being freed for some reason, it
    doesn't go back to memcg's LRU because of !PageCgroupUsed.
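
    The ordering in the Type-2 diagram can be replayed with a toy state
    machine. This is only an illustrative userspace model: the toy_* names
    and the three flags are hypothetical stand-ins, not the kernel's
    struct page, page_cgroup, or the real functions.

```c
#include <stdbool.h>

/* Toy stand-in for page state; not the kernel's struct page. */
struct toy_page {
    bool swap_cache;    /* models PageSwapCache */
    bool cgroup_used;   /* models PageCgroupUsed */
    bool on_memcg_lru;  /* linked to the memcg's LRU */
};

/* processA side: at unmap time the charge is kept only for swap cache pages. */
static void toy_uncharge_page(struct toy_page *p)
{
    if (p->swap_cache)
        return;
    p->cgroup_used = false;
    p->on_memcg_lru = false;
}

/* processB side: reclaim puts the page into swap cache. */
static void toy_add_to_swap(struct toy_page *p)
{
    p->swap_cache = true;
}

/* Putting the page back on the LRU: memcg links it only if still charged. */
static void toy_putback_lru(struct toy_page *p)
{
    p->on_memcg_lru = p->cgroup_used;
}

/* Stale swap cache: present in swap cache but invisible to memcg. */
static bool toy_is_stale(const struct toy_page *p)
{
    return p->swap_cache && !p->cgroup_used && !p->on_memcg_lru;
}
```

    Running toy_uncharge_page() before toy_add_to_swap() leaves the page
    stale in this model, while the opposite order keeps it charged and on
    the memcg's LRU.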


Considering Type-1, it's better to avoid swapin-readahead when memcg is used.
swapin-readahead just reads swp_entries which are near the requested entry, so
pages that will never be used can end up in memory (on the global LRU). When
memcg is used, this is not good behavior anyway.

Considering Type-2, the page should be freed from SwapCache right after writeback.
Freeing swapped-out pages as soon as possible suits memcg well anyway.
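
The Type-2 idea can be written down as a predicate that is evaluated once
writeback has finished. This is a rough sketch under my own assumptions, not
the code of patch 3/3: the struct, its fields, and the function name are
hypothetical, and swap_count here means references to the swap entry other
than the swap cache itself.

```c
#include <stdbool.h>

/* Toy view of a page that was written out to swap (hypothetical fields). */
struct toy_swap_page {
    bool swap_cache;  /* page is still in swap cache */
    bool writeback;   /* writeback to the swap device is in flight */
    int map_count;    /* number of ptes still mapping the page */
    int swap_count;   /* users of the swap entry besides the swap cache */
};

/*
 * After writeback ends, a swap cache page that nobody maps and whose
 * swap entry has no remaining users is stale and can be dropped.
 */
static bool toy_can_drop_swap_cache(const struct toy_swap_page *p)
{
    return p->swap_cache && !p->writeback &&
           p->map_count == 0 && p->swap_count == 0;
}
```

The point of rechecking the count after writeback is that the last user
(processA in Type-2) may have gone away while the page was being written.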

The patch set includes the following:
 [1/3] add the mem_cgroup_is_activated() function, which tells us whether memcg is _really_ used.
 [2/3] fix swap cache handling race by avoiding readahead.
 [3/3] fix swap cache handling race by checking swapcount again.

Results are good in my tests.

Thanks,
-Kame


