* Transparent Hugepage Support #25
@ 2010-05-21  0:05 Andrea Arcangeli
  2010-05-21  3:26   ` Eric Dumazet
  2010-06-02  5:44   ` Daisuke Nishimura
  0 siblings, 2 replies; 16+ messages in thread
From: Andrea Arcangeli @ 2010-05-21  0:05 UTC (permalink / raw)
  To: linux-mm, Andrew Morton, linux-kernel
  Cc: Marcelo Tosatti, Adam Litke, Avi Kivity, Izik Eidus,
	Hugh Dickins, Nick Piggin, Rik van Riel, Mel Gorman, Dave Hansen,
	Benjamin Herrenschmidt, Ingo Molnar, Mike Travis,
	KAMEZAWA Hiroyuki, Christoph Lameter, Chris Wright, bpicco,
	KOSAKI Motohiro, Balbir Singh, Michael S. Tsirkin,
	Peter Zijlstra, Johannes Weiner, Daisuke Nishimura, Chris Mason,
	Borislav Petkov

[-- Attachment #1: Type: text/plain, Size: 8304 bytes --]

If you're running scientific applications, a JVM or large gcc builds
(see the attached patch for gcc), and you want to run 2.5% faster for
a kernel build (on bare metal), 8% faster in translate.o of qemu (on
bare metal), or 15% faster or more with virt and Intel EPT/AMD NPT
(depending on the workload), you should apply and run the transparent
hugepage support on your systems.

Awesome results have already been posted on lkml. If you test and
benchmark it, please provide any positive/negative real-life results on
lkml (or privately to me if you prefer). The more testing the better.

By running your scientific apps up to ~10% faster (or more if you use
virt), and in turn by boosting the performance of the virtualized
cloud, you will save energy. NOTE: it can cost memory in some cases,
but this is why madvise(MADV_HUGEPAGE) exists, so THP can be
selectively enabled on the regions where the app knows there will be
zero memory wasted while boosting performance (like KVM).
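
For illustration, here is a minimal userspace sketch (not part of the
patchset) of opting an anonymous region into THP with
madvise(MADV_HUGEPAGE). It assumes a kernel with this patchset applied,
2M hugepages on x86-64, and that MADV_HUGEPAGE may still be missing
from the libc headers, hence the fallback define:

/*
 * Hedged sketch, not from the patchset: opt an anonymous mapping into
 * THP.  MADV_HUGEPAGE is added by this patchset; the fallback value 14
 * below matches asm-generic/mman-common.h but is an assumption here.
 */
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

#ifndef MADV_HUGEPAGE
#define MADV_HUGEPAGE 14
#endif

#define HPAGE_SIZE (2UL*1024*1024)	/* 2M hugepages on x86-64 */

int main(void)
{
	size_t len = 64 * HPAGE_SIZE;	/* 128M of anonymous memory */
	char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (p == MAP_FAILED) {
		perror("mmap");
		return 1;
	}
	/* Hint the kernel to back this range with hugepages; fails with
	 * EINVAL on kernels without THP. */
	if (madvise(p, len, MADV_HUGEPAGE))
		perror("madvise(MADV_HUGEPAGE)");
	/* Touch the memory so huge pmds actually get faulted in. */
	memset(p, 0, len);
	printf("mapped %zu bytes, now check \"grep Anon /proc/meminfo\"\n", len);
	getchar();	/* keep the mapping alive while you look */
	munmap(p, len);
	return 0;
}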

If you have more memory than you need as filesystem cache, you can
choose the "always" mode. If you're RAM constrained, or you need as
much filesystem cache as possible but still want a CPU boost in the
madvise regions without risking a reduced cache, you should choose
"madvise". All settings can be tuned later with sysfs after boot in
/sys/kernel/mm/transparent_hugepage/ . You can monitor the THP
utilization system-wide with "grep Anon /proc/meminfo".
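
As a companion to the sysfs note above, a small sketch (illustrative
only, assuming the sysfs layout described here and root privileges)
that switches the mode to "madvise" at runtime and reads the setting
back:

/*
 * Hedged sketch: flip THP to "madvise" mode via sysfs and read it back.
 * Assumes the /sys/kernel/mm/transparent_hugepage/ layout described
 * above and that the program runs as root.
 */
#include <stdio.h>

#define THP_ENABLED "/sys/kernel/mm/transparent_hugepage/enabled"

int main(void)
{
	char buf[128];
	FILE *f = fopen(THP_ENABLED, "w");

	if (!f || fputs("madvise\n", f) == EOF)
		perror("write " THP_ENABLED);
	if (f)
		fclose(f);

	f = fopen(THP_ENABLED, "r");
	if (f && fgets(buf, sizeof(buf), f))
		/* the selected mode is typically shown in brackets */
		printf("%s: %s", THP_ENABLED, buf);
	if (f)
		fclose(f);
	return 0;
}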

http://git.kernel.org/?p=linux/kernel/git/andrea/aa.git;a=shortlog
http://git.kernel.org/?p=linux/kernel/git/andrea/aa.git;a=shortlog;h=refs/heads/anon_vma_chain

first: git clone git://git.kernel.org/pub/scm/linux/kernel/git/andrea/aa.git
or first: git clone --reference linux-2.6 git://git.kernel.org/pub/scm/linux/kernel/git/andrea/aa.git
later: git fetch; git checkout -f origin/master
or to run the new anon_vma_chain: git fetch; git checkout -f origin/anon_vma_chain

I am currently running the origin/anon_vma_chain branch on all my
systems here (keeping master around only in case of troubles with the
new anon-vma code).

The tree is rebased and git pull won't work.

http://www.kernel.org/pub/linux/kernel/people/andrea/patches/v2.6/2.6.34/transparent_hugepage-25/
http://www.kernel.org/pub/linux/kernel/people/andrea/patches/v2.6/2.6.34/transparent_hugepage-25.gz
http://www.kernel.org/pub/linux/kernel/people/andrea/patches/v2.6/2.6.34/transparent_hugepage-25-anon_vma_chain.gz

Diff #24 -> #25:

 b/exec-migrate-race-anon_vma-chain                       |  198 ++++++---------

Return to the cleaner fix that really allows the rmap_walk to succeed
at all times (and it also allows migrate and split_huge_page at all
times) without modifying the rmap_walk for this corner case in execve.
This is also more robust for the long term in case the user stack
starts huge and we teach mremap to migrate without splitting hugepages
(the stack may have to be split by some other operation in the VM).

 b/gfp_nomemalloc_wait                                    |   19 -

Fix: still clear ALLOC_CPUSET if the allocation is atomic.

 b/memory-compaction-anon-migrate-anon-vma-chain          |   49 +++
 b/memory-compaction-anon-vma-refcount-anon-vma-chain     |  161 ++++++++----
 b/memory-compaction-migrate-swapcache-anon-vma-chain     |  107 ++++++++
 memory-compaction-anon-vma-share-refcount-anon-vma-chain |  166 ------------

Fix: the anon_vma_chain branch must use drop_anon_vma to be safe with
the anon_vma->root->lock and avoid leaking root anon_vmas.

 b/pte_alloc_trans_splitting                              |   13 

Use pmd_none instead of pmd_present in pte_alloc_map, to be consistent
with __pte_alloc (pmd_none should be a bit faster, and it's stricter
too).

 b/transparent_hugepage                                   |   63 +++-

Race fix in the initial huge pmd page fault (virtio-blk+THP was
crashing the host by running "cp /dev/vda /dev/null" in the guest, with
a 6G RAM guest and a 4G RAM + 4G swap host, immediately after the host
started swapping). Apparently I never reproduced it with any other
workload, so it went unnoticed for a while (using the default ide
emulation instead of virtio-blk didn't show any problem at all either,
probably because of a different threading model or different timings).
But now it's fixed.

 b/root_anon_vma-mm_take_all_locks                        |   81 ++++++

Prevent a deadlock in the root-anon-vma locking when registering in the
mmu notifier (i.e. when starting kvm; but it has only been triggering
with the -daemonize param for some reason, so it went unnoticed before
as I normally run kvm in the foreground).

Diffstat:

 Documentation/cgroups/memory.txt      |    4 
 Documentation/sysctl/vm.txt           |   25 
 Documentation/vm/transhuge.txt        |  283 ++++
 arch/alpha/include/asm/mman.h         |    2 
 arch/mips/include/asm/mman.h          |    2 
 arch/parisc/include/asm/mman.h        |    2 
 arch/powerpc/mm/gup.c                 |   12 
 arch/x86/include/asm/paravirt.h       |   23 
 arch/x86/include/asm/paravirt_types.h |    6 
 arch/x86/include/asm/pgtable-2level.h |    9 
 arch/x86/include/asm/pgtable-3level.h |   23 
 arch/x86/include/asm/pgtable.h        |  144 ++
 arch/x86/include/asm/pgtable_64.h     |   14 
 arch/x86/include/asm/pgtable_types.h  |    3 
 arch/x86/kernel/paravirt.c            |    3 
 arch/x86/kernel/vm86_32.c             |    1 
 arch/x86/kvm/mmu.c                    |   26 
 arch/x86/kvm/paging_tmpl.h            |    4 
 arch/x86/mm/gup.c                     |   25 
 arch/x86/mm/pgtable.c                 |   66 +
 arch/xtensa/include/asm/mman.h        |    2 
 drivers/base/node.c                   |    3 
 fs/Kconfig                            |    2 
 fs/exec.c                             |   37 
 fs/proc/meminfo.c                     |   14 
 fs/proc/page.c                        |   14 
 include/asm-generic/mman-common.h     |    2 
 include/asm-generic/pgtable.h         |  130 ++
 include/linux/compaction.h            |   89 +
 include/linux/gfp.h                   |   14 
 include/linux/huge_mm.h               |  143 ++
 include/linux/khugepaged.h            |   66 +
 include/linux/kvm_host.h              |    4 
 include/linux/memory_hotplug.h        |   14 
 include/linux/migrate.h               |    2 
 include/linux/mm.h                    |   92 +
 include/linux/mm_inline.h             |   13 
 include/linux/mm_types.h              |    3 
 include/linux/mmu_notifier.h          |   40 
 include/linux/mmzone.h                |   10 
 include/linux/page-flags.h            |   36 
 include/linux/rmap.h                  |   58 
 include/linux/sched.h                 |    1 
 include/linux/swap.h                  |    8 
 include/linux/vmstat.h                |    4 
 kernel/fork.c                         |   12 
 kernel/futex.c                        |   67 -
 kernel/sysctl.c                       |   25 
 mm/Kconfig                            |   56 
 mm/Makefile                           |    2 
 mm/compaction.c                       |  620 +++++++++
 mm/huge_memory.c                      | 2172 ++++++++++++++++++++++++++++++++++
 mm/hugetlb.c                          |   69 -
 mm/ksm.c                              |   77 -
 mm/madvise.c                          |    8 
 mm/memcontrol.c                       |   88 -
 mm/memory-failure.c                   |    2 
 mm/memory.c                           |  196 ++-
 mm/memory_hotplug.c                   |   14 
 mm/mempolicy.c                        |   14 
 mm/migrate.c                          |   73 +
 mm/mincore.c                          |  302 ++--
 mm/mmap.c                             |   57 
 mm/mprotect.c                         |   20 
 mm/mremap.c                           |    8 
 mm/page_alloc.c                       |  133 +-
 mm/pagewalk.c                         |    1 
 mm/rmap.c                             |  181 ++
 mm/sparse.c                           |    4 
 mm/swap.c                             |  116 +
 mm/swap_state.c                       |    6 
 mm/swapfile.c                         |    2 
 mm/vmscan.c                           |   42 
 mm/vmstat.c                           |  256 +++-
 virt/kvm/iommu.c                      |    2 
 virt/kvm/kvm_main.c                   |   39 
 76 files changed, 5620 insertions(+), 522 deletions(-)

[-- Attachment #2: gcc-align --]
[-- Type: text/plain, Size: 1501 bytes --]

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
---

--- /var/tmp/portage/sys-devel/gcc-4.4.2/work/gcc-4.4.2/gcc/ggc-page.c	2008-07-28 16:33:56.000000000 +0200
+++ /tmp/gcc-4.4.2/gcc/ggc-page.c	2010-04-25 06:01:32.829753566 +0200
@@ -450,6 +450,11 @@
 #define BITMAP_SIZE(Num_objects) \
   (CEIL ((Num_objects), HOST_BITS_PER_LONG) * sizeof(long))
 
+#ifdef __x86_64__
+#define HPAGE_SIZE (2*1024*1024)
+#define GGC_QUIRE_SIZE 512
+#endif
+
 /* Allocate pages in chunks of this size, to throttle calls to memory
    allocation routines.  The first page is used, the rest go onto the
    free list.  This cannot be larger than HOST_BITS_PER_INT for the
@@ -654,6 +659,23 @@
 #ifdef HAVE_MMAP_ANON
   char *page = (char *) mmap (pref, size, PROT_READ | PROT_WRITE,
 			      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+#ifdef HPAGE_SIZE
+  if (!(size & (HPAGE_SIZE-1)) &&
+      page != (char *) MAP_FAILED && (size_t) page & (HPAGE_SIZE-1)) {
+	  char *old_page;
+	  munmap(page, size);
+	  page = (char *) mmap (pref, size + HPAGE_SIZE-1,
+				PROT_READ | PROT_WRITE,
+				MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+	  old_page = page;
+	  page = (char *) (((size_t)page + HPAGE_SIZE-1)
+			   & ~(HPAGE_SIZE-1));
+	  if (old_page != page)
+		  munmap(old_page, page-old_page);
+	  if (page != old_page + HPAGE_SIZE-1)
+		  munmap(page+size, old_page+HPAGE_SIZE-1-page);
+  }
+#endif
 #endif
 #ifdef HAVE_MMAP_DEV_ZERO
   char *page = (char *) mmap (pref, size, PROT_READ | PROT_WRITE,

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Transparent Hugepage Support #25
  2010-05-21  0:05 Transparent Hugepage Support #25 Andrea Arcangeli
@ 2010-05-21  3:26   ` Eric Dumazet
  2010-06-02  5:44   ` Daisuke Nishimura
  1 sibling, 0 replies; 16+ messages in thread
From: Eric Dumazet @ 2010-05-21  3:26 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: linux-mm, Andrew Morton, linux-kernel, Marcelo Tosatti,
	Adam Litke, Avi Kivity, Izik Eidus, Hugh Dickins, Nick Piggin,
	Rik van Riel, Mel Gorman, Dave Hansen, Benjamin Herrenschmidt,
	Ingo Molnar, Mike Travis, KAMEZAWA Hiroyuki, Christoph Lameter,
	Chris Wright, bpicco, KOSAKI Motohiro, Balbir Singh,
	Michael S. Tsirkin, Peter Zijlstra, Johannes Weiner,
	Daisuke Nishimura, Chris Mason, Borislav Petkov

On Friday 21 May 2010 at 02:05 +0200, Andrea Arcangeli wrote:
> If you're running scientific applications, a JVM or large gcc builds
> (see the attached patch for gcc), and you want to run 2.5% faster for
> a kernel build (on bare metal), 8% faster in translate.o of qemu (on
> bare metal), or 15% faster or more with virt and Intel EPT/AMD NPT
> (depending on the workload), you should apply and run the transparent
> hugepage support on your systems.
> 
> Awesome results have already been posted on lkml. If you test and
> benchmark it, please provide any positive/negative real-life results on
> lkml (or privately to me if you prefer). The more testing the better.
> 

Interesting!

Did you try to change alloc_large_system_hash() to use hugepages for
very large allocations? We currently use vmalloc() on NUMA machines...

Dentry cache hash table entries: 2097152 (order: 12, 16777216 bytes)
Inode-cache hash table entries: 1048576 (order: 11, 8388608 bytes)
IP route cache hash table entries: 524288 (order: 10, 4194304 bytes)
TCP established hash table entries: 524288 (order: 11, 8388608 bytes)


0xffffc90000003000-0xffffc90001004000 16781312 alloc_large_system_hash+0x1d8/0x280 pages=4096 vmalloc vpages N0=2048 N1=2048
0xffffc9000100f000-0xffffc90001810000 8392704 alloc_large_system_hash+0x1d8/0x280 pages=2048 vmalloc vpages N0=1024 N1=1024
0xffffc90005882000-0xffffc90005c83000 4198400 alloc_large_system_hash+0x1d8/0x280 pages=1024 vmalloc vpages N0=512 N1=512
0xffffc90005c84000-0xffffc90006485000 8392704 alloc_large_system_hash+0x1d8/0x280 pages=2048 vmalloc vpages N0=1024 N1=1024




^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Transparent Hugepage Support #25
  2010-05-21  3:26   ` Eric Dumazet
@ 2010-05-21  5:13     ` Nick Piggin
  -1 siblings, 0 replies; 16+ messages in thread
From: Nick Piggin @ 2010-05-21  5:13 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Andrea Arcangeli, linux-mm, Andrew Morton, linux-kernel,
	Marcelo Tosatti, Adam Litke, Avi Kivity, Izik Eidus,
	Hugh Dickins, Rik van Riel, Mel Gorman, Dave Hansen,
	Benjamin Herrenschmidt, Ingo Molnar, Mike Travis,
	KAMEZAWA Hiroyuki, Christoph Lameter, Chris Wright, bpicco,
	KOSAKI Motohiro, Balbir Singh, Michael S. Tsirkin,
	Peter Zijlstra, Johannes Weiner, Daisuke Nishimura, Chris Mason,
	Borislav Petkov

On Fri, May 21, 2010 at 05:26:13AM +0200, Eric Dumazet wrote:
> On Friday 21 May 2010 at 02:05 +0200, Andrea Arcangeli wrote:
> > If you're running scientific applications, a JVM or large gcc builds
> > (see the attached patch for gcc), and you want to run 2.5% faster for
> > a kernel build (on bare metal), 8% faster in translate.o of qemu (on
> > bare metal), or 15% faster or more with virt and Intel EPT/AMD NPT
> > (depending on the workload), you should apply and run the transparent
> > hugepage support on your systems.
> > 
> > Awesome results have already been posted on lkml. If you test and
> > benchmark it, please provide any positive/negative real-life results on
> > lkml (or privately to me if you prefer). The more testing the better.
> > 
> 
> Interesting!
> 
> Did you try to change alloc_large_system_hash() to use hugepages for
> very large allocations? We currently use vmalloc() on NUMA machines...
> 
> Dentry cache hash table entries: 2097152 (order: 12, 16777216 bytes)
> Inode-cache hash table entries: 1048576 (order: 11, 8388608 bytes)
> IP route cache hash table entries: 524288 (order: 10, 4194304 bytes)
> TCP established hash table entries: 524288 (order: 11, 8388608 bytes)

Different (easier) kind of problem there.

We should indeed start using hugepages for special vmalloc cases like
this eventually. Last time I checked, we didn't quite have enough memory
per node to do this (ie. it does not end up being interleaved over all
nodes). It probably starts becoming realistic to do this soon with the
rate of memory size increases.

Probably for tuned servers where various hashes are sized very large,
it already makes sense.

It's on my TODO list.

> 
> 
> 0xffffc90000003000-0xffffc90001004000 16781312 alloc_large_system_hash+0x1d8/0x280 pages=4096 vmalloc vpages N0=2048 N1=2048
> 0xffffc9000100f000-0xffffc90001810000 8392704 alloc_large_system_hash+0x1d8/0x280 pages=2048 vmalloc vpages N0=1024 N1=1024
> 0xffffc90005882000-0xffffc90005c83000 4198400 alloc_large_system_hash+0x1d8/0x280 pages=1024 vmalloc vpages N0=512 N1=512
> 0xffffc90005c84000-0xffffc90006485000 8392704 alloc_large_system_hash+0x1d8/0x280 pages=2048 vmalloc vpages N0=1024 N1=1024
> 
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [RFC][BUGFIX][PATCH 0/2] transhuge-memcg: some fixes (Re: Transparent Hugepage Support #25)
  2010-05-21  0:05 Transparent Hugepage Support #25 Andrea Arcangeli
@ 2010-06-02  5:44   ` Daisuke Nishimura
  2010-06-02  5:44   ` Daisuke Nishimura
  1 sibling, 0 replies; 16+ messages in thread
From: Daisuke Nishimura @ 2010-06-02  5:44 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: linux-mm, Andrew Morton, linux-kernel, Marcelo Tosatti,
	Adam Litke, Avi Kivity, Izik Eidus, Hugh Dickins, Nick Piggin,
	Rik van Riel, Mel Gorman, Dave Hansen, Benjamin Herrenschmidt,
	Ingo Molnar, Mike Travis, KAMEZAWA Hiroyuki, Christoph Lameter,
	Chris Wright, bpicco, KOSAKI Motohiro, Balbir Singh,
	Michael S. Tsirkin, Peter Zijlstra, Johannes Weiner, Chris Mason,
	Borislav Petkov, Daisuke Nishimura

Hi.

Using Transparent Hugepage under memcg has some problems, mainly because
when a hugepage is split, pc->mem_cgroup and pc->flags of the tail pages
are not set. As a result, those pages are not linked to the memcg's LRU,
while the usage remains as it is. This causes many problems: for example,
the directory can never be rmdir'ed, because the group's memcg LRU doesn't
contain enough pages to decrease the usage to 0.

These are trial patches to fix the problem (based on THP-25).

[1/2] is a simple bug fix, and can be folded into "memcg compound" (commit d16259c1
at http://git.kernel.org/?p=linux/kernel/git/andrea/aa.git).
[2/2] is the main patch.

Unfortunately, there seem to be some problems left, so I'm digging into it
and need more tests.
Any comments are welcome.


Thanks,
Daisuke Nishimura.

On Fri, 21 May 2010 02:05:40 +0200, Andrea Arcangeli <aarcange@redhat.com> wrote:
> If you're running scientific applications, a JVM or large gcc builds
> (see the attached patch for gcc), and you want to run 2.5% faster for
> a kernel build (on bare metal), 8% faster in translate.o of qemu (on
> bare metal), or 15% faster or more with virt and Intel EPT/AMD NPT
> (depending on the workload), you should apply and run the transparent
> hugepage support on your systems.
> 
> Awesome results have already been posted on lkml. If you test and
> benchmark it, please provide any positive/negative real-life results on
> lkml (or privately to me if you prefer). The more testing the better.
> 
> By running your scientific apps up to ~10% faster (or more if you use
> virt), and in turn by boosting the performance of the virtualized
> cloud, you will save energy. NOTE: it can cost memory in some cases,
> but this is why madvise(MADV_HUGEPAGE) exists, so THP can be
> selectively enabled on the regions where the app knows there will be
> zero memory wasted while boosting performance (like KVM).
> 
> If you have more memory than you need as filesystem cache, you can
> choose the "always" mode. If you're RAM constrained, or you need as
> much filesystem cache as possible but still want a CPU boost in the
> madvise regions without risking a reduced cache, you should choose
> "madvise". All settings can be tuned later with sysfs after boot in
> /sys/kernel/mm/transparent_hugepage/ . You can monitor the THP
> utilization system-wide with "grep Anon /proc/meminfo".
> 
> http://git.kernel.org/?p=linux/kernel/git/andrea/aa.git;a=shortlog
> http://git.kernel.org/?p=linux/kernel/git/andrea/aa.git;a=shortlog;h=refs/heads/anon_vma_chain
> 
> first: git clone git://git.kernel.org/pub/scm/linux/kernel/git/andrea/aa.git
> or first: git clone --reference linux-2.6 git://git.kernel.org/pub/scm/linux/kernel/git/andrea/aa.git
> later: git fetch; git checkout -f origin/master
> or to run the new anon_vma_chain: git fetch; git checkout -f origin/anon_vma_chain
> 
> I am currently running the origin/anon_vma_chain branch on all my
> systems here (keeping master around only in case of troubles with the
> new anon-vma code).
> 
> The tree is rebased and git pull won't work.
> 
> http://www.kernel.org/pub/linux/kernel/people/andrea/patches/v2.6/2.6.34/transparent_hugepage-25/
> http://www.kernel.org/pub/linux/kernel/people/andrea/patches/v2.6/2.6.34/transparent_hugepage-25.gz
> http://www.kernel.org/pub/linux/kernel/people/andrea/patches/v2.6/2.6.34/transparent_hugepage-25-anon_vma_chain.gz
> 
> Diff #24 -> #25:
> 
>  b/exec-migrate-race-anon_vma-chain                       |  198 ++++++---------
> 
> Return to the cleaner fix that really allows the rmap_walk to succeed
> at all times (and it also allows migrate and split_huge_page at all
> times) without modifying the rmap_walk for this corner case in execve.
> This is also more robust for the long term in case the user stack
> starts huge and we teach mremap to migrate without splitting hugepages
> (the stack may have to be split by some other operation in the VM).
> 
>  b/gfp_nomemalloc_wait                                    |   19 -
> 
> Fix: still clear ALLOC_CPUSET if the allocation is atomic.
> 
>  b/memory-compaction-anon-migrate-anon-vma-chain          |   49 +++
>  b/memory-compaction-anon-vma-refcount-anon-vma-chain     |  161 ++++++++----
>  b/memory-compaction-migrate-swapcache-anon-vma-chain     |  107 ++++++++
>  memory-compaction-anon-vma-share-refcount-anon-vma-chain |  166 ------------
> 
> Fix: the anon_vma_chain branch must use drop_anon_vma to be safe with
> the anon_vma->root->lock and avoid leaking root anon_vmas.
> 
>  b/pte_alloc_trans_splitting                              |   13 
> 
> Use pmd_none instead of pmd_present in pte_alloc_map, to be consistent
> with __pte_alloc (pmd_none should be a bit faster, and it's stricter
> too).
> 
>  b/transparent_hugepage                                   |   63 +++-
> 
> Race fix in the initial huge pmd page fault (virtio-blk+THP was
> crashing the host by running "cp /dev/vda /dev/null" in the guest, with
> a 6G RAM guest and a 4G RAM + 4G swap host, immediately after the host
> started swapping). Apparently I never reproduced it with any other
> workload, so it went unnoticed for a while (using the default ide
> emulation instead of virtio-blk didn't show any problem at all either,
> probably because of a different threading model or different timings).
> But now it's fixed.
> 
>  b/root_anon_vma-mm_take_all_locks                        |   81 ++++++
> 
> Prevent a deadlock in the root-anon-vma locking when registering in the
> mmu notifier (i.e. when starting kvm; but it has only been triggering
> with the -daemonize param for some reason, so it went unnoticed before
> as I normally run kvm in the foreground).
> 
> Diffstat:
> 
>  Documentation/cgroups/memory.txt      |    4 
>  Documentation/sysctl/vm.txt           |   25 
>  Documentation/vm/transhuge.txt        |  283 ++++
>  arch/alpha/include/asm/mman.h         |    2 
>  arch/mips/include/asm/mman.h          |    2 
>  arch/parisc/include/asm/mman.h        |    2 
>  arch/powerpc/mm/gup.c                 |   12 
>  arch/x86/include/asm/paravirt.h       |   23 
>  arch/x86/include/asm/paravirt_types.h |    6 
>  arch/x86/include/asm/pgtable-2level.h |    9 
>  arch/x86/include/asm/pgtable-3level.h |   23 
>  arch/x86/include/asm/pgtable.h        |  144 ++
>  arch/x86/include/asm/pgtable_64.h     |   14 
>  arch/x86/include/asm/pgtable_types.h  |    3 
>  arch/x86/kernel/paravirt.c            |    3 
>  arch/x86/kernel/vm86_32.c             |    1 
>  arch/x86/kvm/mmu.c                    |   26 
>  arch/x86/kvm/paging_tmpl.h            |    4 
>  arch/x86/mm/gup.c                     |   25 
>  arch/x86/mm/pgtable.c                 |   66 +
>  arch/xtensa/include/asm/mman.h        |    2 
>  drivers/base/node.c                   |    3 
>  fs/Kconfig                            |    2 
>  fs/exec.c                             |   37 
>  fs/proc/meminfo.c                     |   14 
>  fs/proc/page.c                        |   14 
>  include/asm-generic/mman-common.h     |    2 
>  include/asm-generic/pgtable.h         |  130 ++
>  include/linux/compaction.h            |   89 +
>  include/linux/gfp.h                   |   14 
>  include/linux/huge_mm.h               |  143 ++
>  include/linux/khugepaged.h            |   66 +
>  include/linux/kvm_host.h              |    4 
>  include/linux/memory_hotplug.h        |   14 
>  include/linux/migrate.h               |    2 
>  include/linux/mm.h                    |   92 +
>  include/linux/mm_inline.h             |   13 
>  include/linux/mm_types.h              |    3 
>  include/linux/mmu_notifier.h          |   40 
>  include/linux/mmzone.h                |   10 
>  include/linux/page-flags.h            |   36 
>  include/linux/rmap.h                  |   58 
>  include/linux/sched.h                 |    1 
>  include/linux/swap.h                  |    8 
>  include/linux/vmstat.h                |    4 
>  kernel/fork.c                         |   12 
>  kernel/futex.c                        |   67 -
>  kernel/sysctl.c                       |   25 
>  mm/Kconfig                            |   56 
>  mm/Makefile                           |    2 
>  mm/compaction.c                       |  620 +++++++++
>  mm/huge_memory.c                      | 2172 ++++++++++++++++++++++++++++++++++
>  mm/hugetlb.c                          |   69 -
>  mm/ksm.c                              |   77 -
>  mm/madvise.c                          |    8 
>  mm/memcontrol.c                       |   88 -
>  mm/memory-failure.c                   |    2 
>  mm/memory.c                           |  196 ++-
>  mm/memory_hotplug.c                   |   14 
>  mm/mempolicy.c                        |   14 
>  mm/migrate.c                          |   73 +
>  mm/mincore.c                          |  302 ++--
>  mm/mmap.c                             |   57 
>  mm/mprotect.c                         |   20 
>  mm/mremap.c                           |    8 
>  mm/page_alloc.c                       |  133 +-
>  mm/pagewalk.c                         |    1 
>  mm/rmap.c                             |  181 ++
>  mm/sparse.c                           |    4 
>  mm/swap.c                             |  116 +
>  mm/swap_state.c                       |    6 
>  mm/swapfile.c                         |    2 
>  mm/vmscan.c                           |   42 
>  mm/vmstat.c                           |  256 +++-
>  virt/kvm/iommu.c                      |    2 
>  virt/kvm/kvm_main.c                   |   39 
>  76 files changed, 5620 insertions(+), 522 deletions(-)

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [RFC][BUGFIX][PATCH 1/2] transhuge-memcg: fix for memcg compound
  2010-06-02  5:44   ` Daisuke Nishimura
@ 2010-06-02  5:45     ` Daisuke Nishimura
  -1 siblings, 0 replies; 16+ messages in thread
From: Daisuke Nishimura @ 2010-06-02  5:45 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: linux-mm, Andrew Morton, linux-kernel, Marcelo Tosatti,
	Adam Litke, Avi Kivity, Izik Eidus, Hugh Dickins, Nick Piggin,
	Rik van Riel, Mel Gorman, Dave Hansen, Benjamin Herrenschmidt,
	Ingo Molnar, Mike Travis, KAMEZAWA Hiroyuki, Christoph Lameter,
	Chris Wright, bpicco, KOSAKI Motohiro, Balbir Singh,
	Michael S. Tsirkin, Peter Zijlstra, Johannes Weiner, Chris Mason,
	Borislav Petkov, Daisuke Nishimura

We should increase/decrease css->refcnt properly when charging/uncharging compound pages.

Without this patch, a bug like the one below happens:

1. create a memcg directory.
2. run a program which uses enough memory for it to be allocated as transparent
   hugepages (a sketch of such a program follows the list).
3. kill the program.
4. try to remove the directory, which will never finish.
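
For reference, a minimal sketch (hypothetical, not part of this patch)
of the kind of program step 2 refers to; the memcg mount point in the
comment and the allocation size are placeholder assumptions:

/*
 * Hedged sketch of step 2: allocate enough anonymous memory for THP to
 * back it with hugepages, then wait to be killed (step 3).  Run it from
 * a shell already attached to the memcg created in step 1, e.g.
 *   mkdir /cgroups/test && echo $$ > /cgroups/test/tasks
 * (the cgroup mount point is an assumption and varies by setup).
 */
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MADV_HUGEPAGE
#define MADV_HUGEPAGE 14	/* added by the THP patchset; value assumed */
#endif

int main(void)
{
	size_t len = 512UL << 20;	/* 512M of anonymous memory */
	char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (p == MAP_FAILED) {
		perror("mmap");
		return 1;
	}
	madvise(p, len, MADV_HUGEPAGE);	/* not strictly needed in "always" mode */
	memset(p, 0xaa, len);		/* fault the hugepages in */
	pause();			/* wait here until killed (step 3) */
	return 0;
}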

Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index b1ac9b1..b74bd83 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1650,8 +1650,9 @@ static int __mem_cgroup_try_charge(struct mm_struct *mm,
 	}
 	if (csize > page_size)
 		refill_stock(mem, csize - page_size);
+	/* increase css->refcnt by the number of tail pages */
 	if (page_size != PAGE_SIZE)
-		__css_get(&mem->css, page_size);
+		__css_get(&mem->css, (page_size >> PAGE_SHIFT) - 1);
 done:
 	return 0;
 nomem:
@@ -2237,7 +2238,7 @@ __mem_cgroup_uncharge_common(struct page *page, enum charge_type ctype)
 	memcg_check_events(mem, page);
 	/* at swapout, this memcg will be accessed to record to swap */
 	if (ctype != MEM_CGROUP_CHARGE_TYPE_SWAPOUT)
-		css_put(&mem->css);
+		__css_put(&mem->css, page_size >> PAGE_SHIFT);
 
 	return mem;
 

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [RFC][BUGFIX][PATCH 2/2] transhuge-memcg: commit tail pages at charge
  2010-06-02  5:44   ` Daisuke Nishimura
@ 2010-06-02  5:46     ` Daisuke Nishimura
  -1 siblings, 0 replies; 16+ messages in thread
From: Daisuke Nishimura @ 2010-06-02  5:46 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: linux-mm, Andrew Morton, linux-kernel, Marcelo Tosatti,
	Adam Litke, Avi Kivity, Izik Eidus, Hugh Dickins, Nick Piggin,
	Rik van Riel, Mel Gorman, Dave Hansen, Benjamin Herrenschmidt,
	Ingo Molnar, Mike Travis, KAMEZAWA Hiroyuki, Christoph Lameter,
	Chris Wright, bpicco, KOSAKI Motohiro, Balbir Singh,
	Michael S. Tsirkin, Peter Zijlstra, Johannes Weiner, Chris Mason,
	Borislav Petkov, Daisuke Nishimura

With this patch, when a transparent hugepage is charged, not only the head page
but also all the tail pages are committed; IOW, pc->mem_cgroup and pc->flags of
the tail pages are set.

Without this patch:

- Tail pages are not linked to any memcg's LRU at splitting. This causes many
  problems; for example, the charged memcg's directory can never be rmdir'ed
  because it doesn't have enough pages to scan to make the usage decrease to 0.
- The "rss" field in memory.stat would be incorrect (a quick check sketch
  follows this list). Moreover, because usage_in_bytes in the root cgroup is
  calculated from the stat and not from the res_counter (since 2.6.32), it
  would be incorrect too.
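
For a quick check of the second point, a hedged sketch (not part of the
patch; the /cgroups/test path is an assumption) that prints the rss line
from a memcg's memory.stat:

/*
 * Hedged sketch: print the "rss" line of a memcg's memory.stat to check
 * the symptom described above.  The mount point /cgroups/test is an
 * assumption; adjust it to wherever the memory cgroup is mounted.
 */
#include <stdio.h>
#include <string.h>

int main(void)
{
	char line[256];
	FILE *f = fopen("/cgroups/test/memory.stat", "r");

	if (!f) {
		perror("memory.stat");
		return 1;
	}
	while (fgets(line, sizeof(line), f))
		if (!strncmp(line, "rss ", 4))
			printf("%s", line);	/* anonymous memory charged to this memcg */
	fclose(f);
	return 0;
}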

Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index b74bd83..708961a 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1739,23 +1739,10 @@ struct mem_cgroup *try_get_mem_cgroup_from_page(struct page *page)
  * commit a charge got by __mem_cgroup_try_charge() and makes page_cgroup to be
  * USED state. If already USED, uncharge and return.
  */
-
-static void __mem_cgroup_commit_charge(struct mem_cgroup *mem,
-				       struct page_cgroup *pc,
-				       enum charge_type ctype,
-				       int page_size)
+static void ____mem_cgroup_commit_charge(struct mem_cgroup *mem,
+					 struct page_cgroup *pc,
+					 enum charge_type ctype)
 {
-	/* try_charge() can return NULL to *memcg, taking care of it. */
-	if (!mem)
-		return;
-
-	lock_page_cgroup(pc);
-	if (unlikely(PageCgroupUsed(pc))) {
-		unlock_page_cgroup(pc);
-		mem_cgroup_cancel_charge(mem, page_size);
-		return;
-	}
-
 	pc->mem_cgroup = mem;
 	/*
 	 * We access a page_cgroup asynchronously without lock_page_cgroup().
@@ -1780,6 +1767,33 @@ static void __mem_cgroup_commit_charge(struct mem_cgroup *mem,
 	}
 
 	mem_cgroup_charge_statistics(mem, pc, true);
+}
+
+static void __mem_cgroup_commit_charge(struct mem_cgroup *mem,
+				       struct page_cgroup *pc,
+				       enum charge_type ctype,
+				       int page_size)
+{
+	int i;
+	int count = page_size >> PAGE_SHIFT;
+
+	/* try_charge() can return NULL to *memcg, taking care of it. */
+	if (!mem)
+		return;
+
+	lock_page_cgroup(pc);
+	if (unlikely(PageCgroupUsed(pc))) {
+		unlock_page_cgroup(pc);
+		mem_cgroup_cancel_charge(mem, page_size);
+		return;
+	}
+
+	/*
+	 * we don't need page_cgroup_lock about tail pages, because they are not
+	 * accessed by any other context at this point.
+	 */
+	for (i = 0; i < count; i++)
+		____mem_cgroup_commit_charge(mem, pc + i, ctype);
 
 	unlock_page_cgroup(pc);
 	/*
@@ -2173,6 +2187,8 @@ direct_uncharge:
 static struct mem_cgroup *
 __mem_cgroup_uncharge_common(struct page *page, enum charge_type ctype)
 {
+	int i;
+	int count;
 	struct page_cgroup *pc;
 	struct mem_cgroup *mem = NULL;
 	struct mem_cgroup_per_zone *mz;
@@ -2187,6 +2203,7 @@ __mem_cgroup_uncharge_common(struct page *page, enum charge_type ctype)
 	if (PageSwapCache(page))
 		return NULL;
 
+	count = page_size >> PAGE_SHIFT;
 	/*
 	 * Check if our page_cgroup is valid
 	 */
@@ -2222,7 +2239,8 @@ __mem_cgroup_uncharge_common(struct page *page, enum charge_type ctype)
 		__do_uncharge(mem, ctype, page_size);
 	if (ctype == MEM_CGROUP_CHARGE_TYPE_SWAPOUT)
 		mem_cgroup_swap_statistics(mem, true);
-	mem_cgroup_charge_statistics(mem, pc, false);
+	for (i = 0; i < count; i++)
+		mem_cgroup_charge_statistics(mem, pc + i, false);
 
 	ClearPageCgroupUsed(pc);
 	/*
@@ -2238,7 +2256,7 @@ __mem_cgroup_uncharge_common(struct page *page, enum charge_type ctype)
 	memcg_check_events(mem, page);
 	/* at swapout, this memcg will be accessed to record to swap */
 	if (ctype != MEM_CGROUP_CHARGE_TYPE_SWAPOUT)
-		__css_put(&mem->css, page_size >> PAGE_SHIFT);
+		__css_put(&mem->css, count);
 
 	return mem;
 

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [RFC][BUGFIX][PATCH 0/2] transhuge-memcg: some fixes (Re: Transparent Hugepage Support #25)
  2010-06-02  5:44   ` Daisuke Nishimura
@ 2010-06-18  1:08     ` Andrea Arcangeli
  -1 siblings, 0 replies; 16+ messages in thread
From: Andrea Arcangeli @ 2010-06-18  1:08 UTC (permalink / raw)
  To: Daisuke Nishimura
  Cc: linux-mm, Andrew Morton, linux-kernel, Marcelo Tosatti,
	Adam Litke, Avi Kivity, Izik Eidus, Hugh Dickins, Nick Piggin,
	Rik van Riel, Mel Gorman, Dave Hansen, Benjamin Herrenschmidt,
	Ingo Molnar, Mike Travis, KAMEZAWA Hiroyuki, Christoph Lameter,
	Chris Wright, bpicco, KOSAKI Motohiro, Balbir Singh,
	Michael S. Tsirkin, Peter Zijlstra, Johannes Weiner, Chris Mason,
	Borislav Petkov

On Wed, Jun 02, 2010 at 02:44:38PM +0900, Daisuke Nishimura wrote:
> These are trial patches to fix the problem (based on THP-25).
> 
> [1/2] is a simple bug fix, and can be folded into "memcg compound" (commit d16259c1
> at http://git.kernel.org/?p=linux/kernel/git/andrea/aa.git).
> [2/2] is the main patch.
> 
> Unfortunately, there seem to be some problems left, so I'm digging into it
> and need more tests.
> Any comments are welcome.

Both are included in -26, but like you said there are problems
left... are you willing to fix those too? There are some slight
differences in the code here and there that make the fixes not very
portable across releases (for example, the uncharge parameter of
move_account, which wasn't there before...).

Thanks a lot for the help!
Andrea

* Re: [RFC][BUGFIX][PATCH 0/2] transhuge-memcg: some fixes (Re: Transparent Hugepage Support #25)
  2010-06-18  1:08     ` Andrea Arcangeli
@ 2010-06-18  4:28       ` Daisuke Nishimura
  -1 siblings, 0 replies; 16+ messages in thread
From: Daisuke Nishimura @ 2010-06-18  4:28 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: linux-mm, Andrew Morton, linux-kernel, Marcelo Tosatti,
	Adam Litke, Avi Kivity, Izik Eidus, Nick Piggin, Rik van Riel,
	Mel Gorman, Dave Hansen, Benjamin Herrenschmidt, Ingo Molnar,
	Mike Travis, KAMEZAWA Hiroyuki, Christoph Lameter, Chris Wright,
	bpicco, KOSAKI Motohiro, Balbir Singh, Michael S. Tsirkin,
	Peter Zijlstra, Johannes Weiner, Chris Mason, Borislav Petkov,
	Hugh Dickins

On Fri, 18 Jun 2010 03:08:40 +0200, Andrea Arcangeli <aarcange@redhat.com> wrote:
> On Wed, Jun 02, 2010 at 02:44:38PM +0900, Daisuke Nishimura wrote:
> > These are trial patches to fix the problem (based on THP-25).
> > 
> > [1/2] is a simple bug fix, and can be folded into "memcg compound" (commit d16259c1
> > at http://git.kernel.org/?p=linux/kernel/git/andrea/aa.git).
> > [2/2] is the main patch.
> > 
> > Unfortunately, there seem to be some problems left, so I'm digging into it and
> > need to do more tests.
> > Any comments are welcome.
> 
> Both are included in -26, but like you said there are problems
> left... are you willing to fix those too?
Will do if necessary, but hmm, I heard from KAMEZAWA-san that he has already sent
some patches to fix similar problems on RHEL6, and I prefer his fixes to mine.
Should I (or KAMEZAWA-san?) forward-port his patches onto the current aa.git?

> There are some slight
> differences in the code here and there that make the fixes not very
> portable across releases (for example, the uncharge parameter of
> move_account, which wasn't there before...).
> 
Agreed. And I think you'll see some extra changes to memcg in 2.6.36...
Anyway, I'll do some tests on both RHEL6 and aa.git when I have time,
and feel free to tell me if you have any trouble back- or forward-porting
the memcg fixes.


Thanks,
Daisuke Nishimura.

* Re: [RFC][BUGFIX][PATCH 0/2] transhuge-memcg: some fixes (Re: Transparent Hugepage Support #25)
  2010-06-18  4:28       ` Daisuke Nishimura
@ 2010-07-09 16:48       ` Andrea Arcangeli
  -1 siblings, 0 replies; 16+ messages in thread
From: Andrea Arcangeli @ 2010-07-09 16:48 UTC (permalink / raw)
  To: Daisuke Nishimura
  Cc: linux-mm, KAMEZAWA Hiroyuki, KOSAKI Motohiro, Johannes Weiner

On Fri, Jun 18, 2010 at 01:28:17PM +0900, Daisuke Nishimura wrote:
> Will do if necessary, but hmm, I heard from KAMEZAWA-san that he has already sent
> some patches to fix similar problems on RHEL6, and I prefer his fixes to mine.
> Should I (or KAMEZAWA-san?) forward-port his patches onto the current aa.git?

Now I also got more memcg fixes from Johannes... included in -27. It'd
be nice to keep things in sync.

http://www.kernel.org/pub/linux/kernel/people/andrea/patches/v2.6/2.6.35-rc4/transparent_hugepage-27/memcg_check_room
http://www.kernel.org/pub/linux/kernel/people/andrea/patches/v2.6/2.6.35-rc4/transparent_hugepage-27/memcg_consume_stock
http://www.kernel.org/pub/linux/kernel/people/andrea/patches/v2.6/2.6.35-rc4/transparent_hugepage-27/memcg_oom

> Agreed. And I think you'll see some extra changes to memcg in 2.6.36...
> Anyway, I'll do some tests on both RHEL6 and aa.git when I have time,
> and feel free to tell me if you have any trouble back- or forward-porting
> the memcg fixes.

I just released a new #27, please use that. Thanks!

Andrea


end of thread, other threads:[~2010-07-09 16:48 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-05-21  0:05 Transparent Hugepage Support #25 Andrea Arcangeli
2010-05-21  3:26 ` Eric Dumazet
2010-05-21  5:13   ` Nick Piggin
2010-06-02  5:44 ` [RFC][BUGFIX][PATCH 0/2] transhuge-memcg: some fixes (Re: Transparent Hugepage Support #25) Daisuke Nishimura
2010-06-02  5:45   ` [RFC][BUGFIX][PATCH 1/2] transhuge-memcg: fix for memcg compound Daisuke Nishimura
2010-06-02  5:46   ` [RFC][BUGFIX][PATCH 2/2] transhuge-memcg: commit tail pages at charge Daisuke Nishimura
2010-06-18  1:08   ` [RFC][BUGFIX][PATCH 0/2] transhuge-memcg: some fixes (Re: Transparent Hugepage Support #25) Andrea Arcangeli
2010-06-18  4:28     ` Daisuke Nishimura
2010-07-09 16:48       ` Andrea Arcangeli
