[RFC][PATCH 0/2] memcg: reduce overhead by coalescing css_get/put

* [RFC][PATCH 0/2] memcg: reduce overhead by coalescing css_get/put
@ 2010-06-03  9:54 KAMEZAWA Hiroyuki
  2010-06-03  9:56 ` [RFC][PATCH 1/2] memcg: coalescing css_get() at charge KAMEZAWA Hiroyuki
  2010-06-03  9:57 ` [RFC][PATCH 2/2] memcg: coalescing css_put KAMEZAWA Hiroyuki
  0 siblings, 2 replies; 3+ messages in thread
From: KAMEZAWA Hiroyuki @ 2010-06-03  9:54 UTC (permalink / raw)
  To: linux-mm; +Cc: nishimura, balbir

This is now under development patch (and I can't guarantee this is free from bug.)

The idea is coalescing multiple css_get/put to __css_get(),__css_put() as
we now do in res_counter charging.

Here is a result with multi-threaded page fault program. The program does continuous 
page fault in 60 sec. If the kernel works better, we can see more page faults.

Here is a test result under a memcg(not root cgroup).

[before Patch]
[root@bluextal test]# /root/bin/perf stat -e page-faults,cache-misses ./multi-fault-all-split 8

 Performance counter stats for './multi-fault-all-split 8':

           12357708  page-faults
          161332057  cache-misses

       60.007931275  seconds time elapsed
    25.31%  multi-fault-all  [kernel.kallsyms]      [k] clear_page_c
     9.24%  multi-fault-all  [kernel.kallsyms]      [k] down_read_trylock
     8.37%  multi-fault-all  [kernel.kallsyms]      [k] try_get_mem_cgroup_from_mm
     5.21%  multi-fault-all  [kernel.kallsyms]      [k] __alloc_pages_nodemask
     5.13%  multi-fault-all  [kernel.kallsyms]      [k] _raw_spin_lock_irqsave
     4.91%  multi-fault-all  [kernel.kallsyms]      [k] __css_put
     4.66%  multi-fault-all  [kernel.kallsyms]      [k] up_read
     3.17%  multi-fault-all  [kernel.kallsyms]      [k] css_put
     2.77%  multi-fault-all  [kernel.kallsyms]      [k] _raw_spin_lock_irq
     2.58%  multi-fault-all  [kernel.kallsyms]      [k] page_fault

[after Patch]
[root@bluextal test]#  /root/bin/perf stat -e page-faults,cache-misses ./multi-fault-all-split 8

 Performance counter stats for './multi-fault-all-split 8':

           13615258  page-faults
          153207110  cache-misses

       60.004117823  seconds time elapsed

# Overhead          Command          Shared Object  Symbol
# ........  ...............  .....................  ......
#
    27.70%  multi-fault-all  [kernel.kallsyms]      [k] clear_page_c
    11.18%  multi-fault-all  [kernel.kallsyms]      [k] down_read_trylock
     7.54%  multi-fault-all  [kernel.kallsyms]      [k] _raw_spin_lock_irqsave
     5.99%  multi-fault-all  [kernel.kallsyms]      [k] up_read
     5.90%  multi-fault-all  [kernel.kallsyms]      [k] __alloc_pages_nodemask
     5.13%  multi-fault-all  [kernel.kallsyms]      [k] _raw_spin_lock_irq
     2.73%  multi-fault-all  [kernel.kallsyms]      [k] __mem_cgroup_commit_charge
     2.71%  multi-fault-all  [kernel.kallsyms]      [k] page_fault
     2.66%  multi-fault-all  [kernel.kallsyms]      [k] handle_mm_fault
     2.35%  multi-fault-all  [kernel.kallsyms]      [k] _raw_spin_lock

You can see cache-miss/page-faults is improved and no css_get/css_put in overhead
stat record. Please give me your review if interested.

(I tried to get rid of css_get()/put() per a page ...but..it seems no very easy.
 So, now trying to reduce overheads.)

Thanks,
-Kame

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 3+ messages in thread