linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] hugetlb fixes
@ 2013-06-18 18:47 Joern Engel
  2013-06-18 18:47 ` [PATCH 1/2] hugetlb: properly account rss Joern Engel
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Joern Engel @ 2013-06-18 18:47 UTC (permalink / raw)
  To: linux-kernel, linux-mm; +Cc: Andrew Morton, Joern Engel

As everyone knows, hugetlbfs sucks.  But it is also necessary for
large memory machines, so we should make it suck less.  Top of my list
were lack of rss accounting and refusing mmap with MAP_HUGETLB when
using hugetlbfs.  The latter generally created a know in every brain I
explained this to.

Test program below is failing before these two patches and passing
after.

Joern Engel (2):
  hugetlb: properly account rss
  mmap: allow MAP_HUGETLB for hugetlbfs files

 mm/hugetlb.c |    4 ++++
 mm/mmap.c    |   12 ++++++++++--
 2 files changed, 14 insertions(+), 2 deletions(-)

-- 
1.7.10.4

#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

typedef unsigned long long u64;

static size_t length = 1 << 24;

static u64 read_rss(void)
{
	char buf[4096], *s = buf;
	int i, fd;
	u64 rss;

	fd = open("/proc/self/statm", O_RDONLY);
	assert(fd > 2);
	memset(buf, 0, sizeof(buf));
	read(fd, buf, sizeof(buf) - 1);
	for (i = 0; i < 1; i++)
		s = strchr(s, ' ') + 1;
	rss = strtoull(s, NULL, 10);
	return rss << 12; /* assumes 4k pagesize */
}

static void do_mmap(int fd, int extra_flags, int unmap)
{
	int *p;
	int flags = MAP_PRIVATE | MAP_POPULATE | extra_flags;
	u64 before, after;

	before = read_rss();
	p = mmap(NULL, length, PROT_READ | PROT_WRITE, flags, fd, 0);
	assert(p != MAP_FAILED ||
			!"mmap returned an unexpected error");
	after = read_rss();
	assert(llabs(after - before - length) < 0x40000 ||
			!"rss didn't grow as expected");
	if (!unmap)
		return;
	munmap(p, length);
	after = read_rss();
	assert(llabs(after - before) < 0x40000 ||
			!"rss didn't shrink as expected");
}

static int open_file(const char *path)
{
	int fd, err;

	unlink(path);
	fd = open(path, O_CREAT | O_RDWR | O_TRUNC | O_EXCL
			| O_LARGEFILE | O_CLOEXEC, 0600);
	assert(fd > 2);
	unlink(path);
	err = ftruncate(fd, length);
	assert(!err);
	return fd;
}

int main(void)
{
	int hugefd, fd;

	fd = open_file("/dev/shm/hugetlbhog");
	hugefd = open_file("/hugepages/hugetlbhog");

	system("echo 100 > /proc/sys/vm/nr_hugepages");
	do_mmap(-1, MAP_ANONYMOUS, 1);
	do_mmap(fd, 0, 1);
	do_mmap(-1, MAP_ANONYMOUS | MAP_HUGETLB, 1);
	do_mmap(hugefd, 0, 1);
	do_mmap(hugefd, MAP_HUGETLB, 1);
	/* Leak the last one to test do_exit() */
	do_mmap(-1, MAP_ANONYMOUS | MAP_HUGETLB, 0);
	printf("oll korrekt.\n");
	return 0;
}

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/2] hugetlb: properly account rss
  2013-06-18 18:47 [PATCH 0/2] hugetlb fixes Joern Engel
@ 2013-06-18 18:47 ` Joern Engel
  2013-06-28 22:03   ` Andrew Morton
  2013-07-03 13:41   ` Steve Capper
  2013-06-18 18:47 ` [PATCH 2/2] mmap: allow MAP_HUGETLB for hugetlbfs files Joern Engel
  2013-06-18 18:50 ` [PATCH 0/2] hugetlb fixes Jörn Engel
  2 siblings, 2 replies; 13+ messages in thread
From: Joern Engel @ 2013-06-18 18:47 UTC (permalink / raw)
  To: linux-kernel, linux-mm; +Cc: Andrew Morton, Joern Engel

When moving a program from mmap'ing small pages to mmap'ing huge pages,
a remarkable drop in rss ensues.  For some reason hugepages were never
accounted for in rss, which in my book is a clear bug.  Sadly this bug
has been present in hugetlbfs since it was merged back in 2002.  There
is every chance existing programs depend on hugepages not being counted
as rss.

I think the correct solution is to fix the bug and wait for someone to
complain.  It is just as likely that noone cares - as evidenced by the
fact that noone seems to have noticed for ten years.

Signed-off-by: Joern Engel <joern@logfs.org>
---
 mm/hugetlb.c |    4 ++++
 1 file changed, 4 insertions(+)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 1a12f5b..705036c 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1174,6 +1174,7 @@ static struct page *alloc_huge_page(struct vm_area_struct *vma,
 	set_page_private(page, (unsigned long)spool);
 
 	vma_commit_reservation(h, vma, addr);
+	add_mm_counter(vma->vm_mm, MM_ANONPAGES, pages_per_huge_page(h));
 	return page;
 }
 
@@ -2406,6 +2407,9 @@ again:
 		if (pte_dirty(pte))
 			set_page_dirty(page);
 
+		/* -pages_per_huge_page(h) wouldn't get sign-extended */
+		add_mm_counter(vma->vm_mm, MM_ANONPAGES, -1 << h->order);
+
 		page_remove_rmap(page);
 		force_flush = !__tlb_remove_page(tlb, page);
 		if (force_flush)
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 2/2] mmap: allow MAP_HUGETLB for hugetlbfs files
  2013-06-18 18:47 [PATCH 0/2] hugetlb fixes Joern Engel
  2013-06-18 18:47 ` [PATCH 1/2] hugetlb: properly account rss Joern Engel
@ 2013-06-18 18:47 ` Joern Engel
  2013-06-19  1:22   ` Jianguo Wu
  2013-06-18 18:50 ` [PATCH 0/2] hugetlb fixes Jörn Engel
  2 siblings, 1 reply; 13+ messages in thread
From: Joern Engel @ 2013-06-18 18:47 UTC (permalink / raw)
  To: linux-kernel, linux-mm; +Cc: Andrew Morton, Joern Engel

It is counterintuitive at best that mmap'ing a hugetlbfs file with
MAP_HUGETLB fails, while mmap'ing it without will a) succeed and b)
return huge pages.

Signed-off-by: Joern Engel <joern@logfs.org>
---
 mm/mmap.c |   12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index 2a594246..76eb6df 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -33,6 +33,7 @@
 #include <linux/uprobes.h>
 #include <linux/rbtree_augmented.h>
 #include <linux/sched/sysctl.h>
+#include <linux/magic.h>
 
 #include <asm/uaccess.h>
 #include <asm/cacheflush.h>
@@ -1313,6 +1314,11 @@ unsigned long do_mmap_pgoff(struct file *file, unsigned long addr,
 	return addr;
 }
 
+static inline int is_hugetlb_file(struct file *file)
+{
+	return file->f_inode->i_sb->s_magic == HUGETLBFS_MAGIC;
+}
+
 SYSCALL_DEFINE6(mmap_pgoff, unsigned long, addr, unsigned long, len,
 		unsigned long, prot, unsigned long, flags,
 		unsigned long, fd, unsigned long, pgoff)
@@ -1322,11 +1328,12 @@ SYSCALL_DEFINE6(mmap_pgoff, unsigned long, addr, unsigned long, len,
 
 	if (!(flags & MAP_ANONYMOUS)) {
 		audit_mmap_fd(fd, flags);
-		if (unlikely(flags & MAP_HUGETLB))
-			return -EINVAL;
 		file = fget(fd);
 		if (!file)
 			goto out;
+		retval = -EINVAL;
+		if (unlikely(flags & MAP_HUGETLB && !is_hugetlb_file(file)))
+			goto out_fput;
 	} else if (flags & MAP_HUGETLB) {
 		struct user_struct *user = NULL;
 		/*
@@ -1346,6 +1353,7 @@ SYSCALL_DEFINE6(mmap_pgoff, unsigned long, addr, unsigned long, len,
 	flags &= ~(MAP_EXECUTABLE | MAP_DENYWRITE);
 
 	retval = vm_mmap_pgoff(file, addr, len, prot, flags, pgoff);
+out_fput:
 	if (file)
 		fput(file);
 out:
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/2] hugetlb fixes
  2013-06-18 18:47 [PATCH 0/2] hugetlb fixes Joern Engel
  2013-06-18 18:47 ` [PATCH 1/2] hugetlb: properly account rss Joern Engel
  2013-06-18 18:47 ` [PATCH 2/2] mmap: allow MAP_HUGETLB for hugetlbfs files Joern Engel
@ 2013-06-18 18:50 ` Jörn Engel
  2013-06-18 20:27   ` Andrew Morton
  2 siblings, 1 reply; 13+ messages in thread
From: Jörn Engel @ 2013-06-18 18:50 UTC (permalink / raw)
  To: linux-kernel, linux-mm; +Cc: Andrew Morton

On Tue, 18 June 2013 14:47:03 -0400, Joern Engel wrote:
> 
> Test program below is failing before these two patches and passing
> after.

Actually, do we have a place to stuff kernel tests?  And if not,
should we have one?

Jörn

--
My second remark is that our intellectual powers are rather geared to
master static relations and that our powers to visualize processes
evolving in time are relatively poorly developed.
-- Edsger W. Dijkstra

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/2] hugetlb fixes
  2013-06-18 20:27   ` Andrew Morton
@ 2013-06-18 20:04     ` Jörn Engel
  0 siblings, 0 replies; 13+ messages in thread
From: Jörn Engel @ 2013-06-18 20:04 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, linux-mm

On Tue, 18 June 2013 13:27:05 -0700, Andrew Morton wrote:
> On Tue, 18 Jun 2013 14:50:55 -0400 J__rn Engel <joern@logfs.org> wrote:
> 
> > On Tue, 18 June 2013 14:47:03 -0400, Joern Engel wrote:
> > > 
> > > Test program below is failing before these two patches and passing
> > > after.
> > 
> > Actually, do we have a place to stuff kernel tests?  And if not,
> > should we have one?
> 
> Yep, tools/testing/selftests/vm.  It's pretty simple and stupid at
> present - it anything about the framework irritates you, please fix it!

Just did. :)

Jörn

--
Maintenance in other professions and of other articles is concerned with
the return of the item to its original state; in Software, maintenance
is concerned with moving an item away from its original state.
-- Les Belady

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/2] hugetlb fixes
  2013-06-18 18:50 ` [PATCH 0/2] hugetlb fixes Jörn Engel
@ 2013-06-18 20:27   ` Andrew Morton
  2013-06-18 20:04     ` Jörn Engel
  0 siblings, 1 reply; 13+ messages in thread
From: Andrew Morton @ 2013-06-18 20:27 UTC (permalink / raw)
  To: Jörn Engel; +Cc: linux-kernel, linux-mm

On Tue, 18 Jun 2013 14:50:55 -0400 J__rn Engel <joern@logfs.org> wrote:

> On Tue, 18 June 2013 14:47:03 -0400, Joern Engel wrote:
> > 
> > Test program below is failing before these two patches and passing
> > after.
> 
> Actually, do we have a place to stuff kernel tests?  And if not,
> should we have one?

Yep, tools/testing/selftests/vm.  It's pretty simple and stupid at
present - it anything about the framework irritates you, please fix it!

General guidelines for tools/testing/selftests: the tool should execute
quickly and shouldn't break the overall selftests run at either compile
time or runtime if kernel features are absent, Kconfig is unexpected,
etc.

It's more a "place to accumulate and maintain selftest programs" than a
serious self-testing framework.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] mmap: allow MAP_HUGETLB for hugetlbfs files
  2013-06-18 18:47 ` [PATCH 2/2] mmap: allow MAP_HUGETLB for hugetlbfs files Joern Engel
@ 2013-06-19  1:22   ` Jianguo Wu
  2013-06-19 16:25     ` [PATCH] mmap: allow MAP_HUGETLB for hugetlbfs files v2 Jörn Engel
  2013-06-19 16:25     ` [PATCH 2/2] mmap: allow MAP_HUGETLB for hugetlbfs files Jörn Engel
  0 siblings, 2 replies; 13+ messages in thread
From: Jianguo Wu @ 2013-06-19  1:22 UTC (permalink / raw)
  To: Joern Engel; +Cc: linux-kernel, linux-mm, Andrew Morton

Hi Joern,

On 2013/6/19 2:47, Joern Engel wrote:

> It is counterintuitive at best that mmap'ing a hugetlbfs file with
> MAP_HUGETLB fails, while mmap'ing it without will a) succeed and b)
> return huge pages.
> 
> Signed-off-by: Joern Engel <joern@logfs.org>
> ---
>  mm/mmap.c |   12 ++++++++++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 2a594246..76eb6df 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -33,6 +33,7 @@
>  #include <linux/uprobes.h>
>  #include <linux/rbtree_augmented.h>
>  #include <linux/sched/sysctl.h>
> +#include <linux/magic.h>
>  
>  #include <asm/uaccess.h>
>  #include <asm/cacheflush.h>
> @@ -1313,6 +1314,11 @@ unsigned long do_mmap_pgoff(struct file *file, unsigned long addr,
>  	return addr;
>  }
>  
> +static inline int is_hugetlb_file(struct file *file)
> +{
> +	return file->f_inode->i_sb->s_magic == HUGETLBFS_MAGIC;
> +}
> +
>  SYSCALL_DEFINE6(mmap_pgoff, unsigned long, addr, unsigned long, len,
>  		unsigned long, prot, unsigned long, flags,
>  		unsigned long, fd, unsigned long, pgoff)
> @@ -1322,11 +1328,12 @@ SYSCALL_DEFINE6(mmap_pgoff, unsigned long, addr, unsigned long, len,
>  
>  	if (!(flags & MAP_ANONYMOUS)) {
>  		audit_mmap_fd(fd, flags);
> -		if (unlikely(flags & MAP_HUGETLB))
> -			return -EINVAL;
>  		file = fget(fd);
>  		if (!file)
>  			goto out;
> +		retval = -EINVAL;
> +		if (unlikely(flags & MAP_HUGETLB && !is_hugetlb_file(file)))

We already have is_file_hugepages().

Thanks,
Jianguo Wu

> +			goto out_fput;
>  	} else if (flags & MAP_HUGETLB) {
>  		struct user_struct *user = NULL;
>  		/*
> @@ -1346,6 +1353,7 @@ SYSCALL_DEFINE6(mmap_pgoff, unsigned long, addr, unsigned long, len,
>  	flags &= ~(MAP_EXECUTABLE | MAP_DENYWRITE);
>  
>  	retval = vm_mmap_pgoff(file, addr, len, prot, flags, pgoff);
> +out_fput:
>  	if (file)
>  		fput(file);
>  out:




^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH] mmap: allow MAP_HUGETLB for hugetlbfs files v2
  2013-06-19  1:22   ` Jianguo Wu
@ 2013-06-19 16:25     ` Jörn Engel
  2013-06-20  1:31       ` Jianguo Wu
  2013-06-19 16:25     ` [PATCH 2/2] mmap: allow MAP_HUGETLB for hugetlbfs files Jörn Engel
  1 sibling, 1 reply; 13+ messages in thread
From: Jörn Engel @ 2013-06-19 16:25 UTC (permalink / raw)
  To: Jianguo Wu; +Cc: linux-kernel, linux-mm, Andrew Morton

It is counterintuitive at best that mmap'ing a hugetlbfs file with
MAP_HUGETLB fails, while mmap'ing it without will a) succeed and b)
return huge pages.
v2: use is_file_hugepages(), as suggested by Jianguo

Signed-off-by: Joern Engel <joern@logfs.org>
Cc: Jianguo Wu <wujianguo@huawei.com>
---
 mm/mmap.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index 2a594246..cdc8e7a 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1322,11 +1322,12 @@ SYSCALL_DEFINE6(mmap_pgoff, unsigned long, addr, unsigned long, len,
 
 	if (!(flags & MAP_ANONYMOUS)) {
 		audit_mmap_fd(fd, flags);
-		if (unlikely(flags & MAP_HUGETLB))
-			return -EINVAL;
 		file = fget(fd);
 		if (!file)
 			goto out;
+		retval = -EINVAL;
+		if (unlikely(flags & MAP_HUGETLB && !is_file_hugepages(file)))
+			goto out_fput;
 	} else if (flags & MAP_HUGETLB) {
 		struct user_struct *user = NULL;
 		/*
@@ -1346,6 +1347,7 @@ SYSCALL_DEFINE6(mmap_pgoff, unsigned long, addr, unsigned long, len,
 	flags &= ~(MAP_EXECUTABLE | MAP_DENYWRITE);
 
 	retval = vm_mmap_pgoff(file, addr, len, prot, flags, pgoff);
+out_fput:
 	if (file)
 		fput(file);
 out:
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] mmap: allow MAP_HUGETLB for hugetlbfs files
  2013-06-19  1:22   ` Jianguo Wu
  2013-06-19 16:25     ` [PATCH] mmap: allow MAP_HUGETLB for hugetlbfs files v2 Jörn Engel
@ 2013-06-19 16:25     ` Jörn Engel
  1 sibling, 0 replies; 13+ messages in thread
From: Jörn Engel @ 2013-06-19 16:25 UTC (permalink / raw)
  To: Jianguo Wu; +Cc: linux-kernel, linux-mm, Andrew Morton

On Wed, 19 June 2013 09:22:49 +0800, Jianguo Wu wrote:
> 
> We already have is_file_hugepages().

Indeed.  Much nicer now.  Thanks!

Jörn

--
The grand essentials of happiness are: something to do, something to
love, and something to hope for.
-- Allan K. Chalmers

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] mmap: allow MAP_HUGETLB for hugetlbfs files v2
  2013-06-19 16:25     ` [PATCH] mmap: allow MAP_HUGETLB for hugetlbfs files v2 Jörn Engel
@ 2013-06-20  1:31       ` Jianguo Wu
  0 siblings, 0 replies; 13+ messages in thread
From: Jianguo Wu @ 2013-06-20  1:31 UTC (permalink / raw)
  To: Jörn Engel; +Cc: linux-kernel, linux-mm, Andrew Morton

Hi Joern,

I cannot apply this patch to Linus tree, as the code has been modified since
commit af73e4d9506d ("hugetlbfs: fix mmap failure in unaligned size request").

Thanks,
Jianguo Wu

On 2013/6/20 0:25, Jörn Engel wrote:

> It is counterintuitive at best that mmap'ing a hugetlbfs file with
> MAP_HUGETLB fails, while mmap'ing it without will a) succeed and b)
> return huge pages.
> v2: use is_file_hugepages(), as suggested by Jianguo
> 
> Signed-off-by: Joern Engel <joern@logfs.org>
> Cc: Jianguo Wu <wujianguo@huawei.com>
> ---
>  mm/mmap.c |    6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 2a594246..cdc8e7a 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -1322,11 +1322,12 @@ SYSCALL_DEFINE6(mmap_pgoff, unsigned long, addr, unsigned long, len,
>  
>  	if (!(flags & MAP_ANONYMOUS)) {
>  		audit_mmap_fd(fd, flags);
> -		if (unlikely(flags & MAP_HUGETLB))
> -			return -EINVAL;
>  		file = fget(fd);
>  		if (!file)
>  			goto out;
> +		retval = -EINVAL;
> +		if (unlikely(flags & MAP_HUGETLB && !is_file_hugepages(file)))
> +			goto out_fput;
>  	} else if (flags & MAP_HUGETLB) {
>  		struct user_struct *user = NULL;
>  		/*
> @@ -1346,6 +1347,7 @@ SYSCALL_DEFINE6(mmap_pgoff, unsigned long, addr, unsigned long, len,
>  	flags &= ~(MAP_EXECUTABLE | MAP_DENYWRITE);
>  
>  	retval = vm_mmap_pgoff(file, addr, len, prot, flags, pgoff);
> +out_fput:
>  	if (file)
>  		fput(file);
>  out:




^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/2] hugetlb: properly account rss
  2013-06-18 18:47 ` [PATCH 1/2] hugetlb: properly account rss Joern Engel
@ 2013-06-28 22:03   ` Andrew Morton
  2013-07-03 13:41   ` Steve Capper
  1 sibling, 0 replies; 13+ messages in thread
From: Andrew Morton @ 2013-06-28 22:03 UTC (permalink / raw)
  To: Joern Engel; +Cc: linux-kernel, linux-mm

On Tue, 18 Jun 2013 14:47:04 -0400 Joern Engel <joern@logfs.org> wrote:

> When moving a program from mmap'ing small pages to mmap'ing huge pages,
> a remarkable drop in rss ensues.  For some reason hugepages were never
> accounted for in rss, which in my book is a clear bug.  Sadly this bug
> has been present in hugetlbfs since it was merged back in 2002.  There
> is every chance existing programs depend on hugepages not being counted
> as rss.
> 
> I think the correct solution is to fix the bug and wait for someone to
> complain.  It is just as likely that noone cares - as evidenced by the
> fact that noone seems to have noticed for ten years.
> 

Yes, that is a concern.  I'll toss it in there so we can see what
happens, but I fear that any problems will take a long time to be
discovered.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/2] hugetlb: properly account rss
  2013-06-18 18:47 ` [PATCH 1/2] hugetlb: properly account rss Joern Engel
  2013-06-28 22:03   ` Andrew Morton
@ 2013-07-03 13:41   ` Steve Capper
  2013-07-03 14:55     ` Steve Capper
  1 sibling, 1 reply; 13+ messages in thread
From: Steve Capper @ 2013-07-03 13:41 UTC (permalink / raw)
  To: Joern Engel; +Cc: linux-kernel, linux-mm, Andrew Morton

On Tue, Jun 18, 2013 at 02:47:04PM -0400, Joern Engel wrote:
> When moving a program from mmap'ing small pages to mmap'ing huge pages,
> a remarkable drop in rss ensues.  For some reason hugepages were never
> accounted for in rss, which in my book is a clear bug.  Sadly this bug
> has been present in hugetlbfs since it was merged back in 2002.  There
> is every chance existing programs depend on hugepages not being counted
> as rss.
> 
> I think the correct solution is to fix the bug and wait for someone to
> complain.  It is just as likely that noone cares - as evidenced by the
> fact that noone seems to have noticed for ten years.
> 
> Signed-off-by: Joern Engel <joern@logfs.org>
> ---
>  mm/hugetlb.c |    4 ++++
>  1 file changed, 4 insertions(+)
> 

Hi,
This patch has caused a few warnings for me today when it was integrated into
linux-next. The libhugetlbfs test suite gave me:
[ 94.320661] BUG: Bad rss-counter state mm:ffff880119461040 idx:1 val:-512
[ 94.330346] BUG: Bad rss-counter state mm:ffff880119460680 idx:1 val:-2560
[ 94.341746] BUG: Bad rss-counter state mm:ffff880119460d00 idx:1 val:-512
[ 94.347518] BUG: Bad rss-counter state mm:ffff880119460d00 idx:1 val:-512
[ 94.415203] BUG: Bad rss-counter state mm:ffff8801194f9040 idx:1 val:-1024

[ ...]

I think I've found the cause; MAP_SHARED mappings.
alloc_huge_page and __unmap_hugepage_range are called for shared pages. Also,
__unmap_hugepage_range is called more times than alloc_huge_page (which makes
sense as multiple views of a shared mapping are unmapped) leading to negative
counter values.

Excluding VM_SHARED VMAs from the counter increment/decrement stopped the
warnings for me. Although this may not be the best way to address the issue.

Cheers,
-- 
Steve

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/2] hugetlb: properly account rss
  2013-07-03 13:41   ` Steve Capper
@ 2013-07-03 14:55     ` Steve Capper
  0 siblings, 0 replies; 13+ messages in thread
From: Steve Capper @ 2013-07-03 14:55 UTC (permalink / raw)
  To: Joern Engel; +Cc: linux-kernel, linux-mm, Andrew Morton

On Wed, Jul 03, 2013 at 02:41:19PM +0100, Steve Capper wrote:

[ ... ]

> Excluding VM_SHARED VMAs from the counter increment/decrement stopped the
> warnings for me.

Whoops sorry, the fork copy on write test flagged a BUG, hugetlb_cow will need
to be examined too.

Cheers,
-- 
Steve


^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2013-07-03 14:56 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-06-18 18:47 [PATCH 0/2] hugetlb fixes Joern Engel
2013-06-18 18:47 ` [PATCH 1/2] hugetlb: properly account rss Joern Engel
2013-06-28 22:03   ` Andrew Morton
2013-07-03 13:41   ` Steve Capper
2013-07-03 14:55     ` Steve Capper
2013-06-18 18:47 ` [PATCH 2/2] mmap: allow MAP_HUGETLB for hugetlbfs files Joern Engel
2013-06-19  1:22   ` Jianguo Wu
2013-06-19 16:25     ` [PATCH] mmap: allow MAP_HUGETLB for hugetlbfs files v2 Jörn Engel
2013-06-20  1:31       ` Jianguo Wu
2013-06-19 16:25     ` [PATCH 2/2] mmap: allow MAP_HUGETLB for hugetlbfs files Jörn Engel
2013-06-18 18:50 ` [PATCH 0/2] hugetlb fixes Jörn Engel
2013-06-18 20:27   ` Andrew Morton
2013-06-18 20:04     ` Jörn Engel

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).