From: Mel Gorman <mel@csn.ul.ie>
To: Peter Zijlstra <peterz@infradead.org>
Cc: r6144 <rainy6144@gmail.com>,
	linux-kernel@vger.kernel.org, Darren Hart <dvhltc@us.ibm.com>,
	tglx <tglx@linutronix.de>, Andrea Arcangeli <aarcange@redhat.com>,
	Lee Schermerhorn <lee.schermerhorn@hp.com>
Subject: Re: Process-shared futexes on hugepages puts the kernel in an infinite loop in 2.6.32.11; is this fixed now?
Date: Mon, 19 Apr 2010 17:34:48 +0100
Message-ID: <20100419163448.GY19264@csn.ul.ie>
In-Reply-To: <1271691905.1488.317.camel@laptop>

On Mon, Apr 19, 2010 at 05:45:05PM +0200, Peter Zijlstra wrote:
> On Mon, 2010-04-19 at 16:32 +0100, Mel Gorman wrote:
> > Fix infinite loop in get_futex_key when backed by huge pages
> > 
> > If a futex key happens to be located within a huge page mapped MAP_PRIVATE,
> > get_futex_key() can go into an infinite loop waiting for a page->mapping
> > that will never exist. This was reported and documented in an external
> > bugzilla at
> > 
> > https://bugzilla.redhat.com/show_bug.cgi?id=552257
> > 
> > This patch makes page->mapping a poisoned value that includes PAGE_MAPPING_ANON
> > mapped MAP_PRIVATE.  This is enough for futex to continue but because
> > of PAGE_MAPPING_ANON, the poisoned value is not dereferenced or used by
> > futex. No other part of the VM should be dereferencing the page->mapping of
> > a hugetlbfs page as its page cache is not on the LRU.
> > 
> > This patch fixes the problem with the test case described in the bugzilla.
> > 
> > Signed-off-by: Mel Gorman <mel@csn.ul.ie>
> > ---
> >  include/linux/poison.h |   10 ++++++++++
> >  mm/hugetlb.c           |    6 +++++-
> >  2 files changed, 15 insertions(+), 1 deletions(-)
> > 
> > diff --git a/include/linux/poison.h b/include/linux/poison.h
> > index 2110a81..0f7b5ac 100644
> > --- a/include/linux/poison.h
> > +++ b/include/linux/poison.h
> > @@ -48,6 +48,16 @@
> >  #define POISON_FREE    0x6b    /* for use-after-free poisoning */
> >  #define        POISON_END      0xa5    /* end-byte of poisoning */
> >  
> > +/********** mm/hugetlb.c **********/
> > +/*
> > + * Private mappings of hugetlb pages use this poisoned value for
> > + * page->mapping. The core VM should not be doing anything with this mapping
> > + * but futex requires the existance of some page->mapping value even if it
> > + * is unused. If the core VM does deference the mapping, it'll look like a
> > + * suspiciously high null-pointer offset starting from 0x2e5
> > + */
> > +#define HUGETLB_PRIVATE_MAPPING        (0x2e4 | PAGE_MAPPING_ANON)
> 
> Wouldn't a longer poison be more recognisable? Also, shouldn't this use
> POISON_POINTER_DELTA?
> 

I was looking for an address below 0x1000 because such an address would only
be valid in very rare cases. I wasn't as sure about any other pointer value,
and only x86-64 appears to define POISON_POINTER_DELTA.

> Something like:
> 
> #define HUGETBL_POISON	((void *) 0x00300300 + POISON_POINTER_DELTA)
> 
> 0x2e5 isn't that high, I've had actual derefs in that range.
> 

So have I, but the value couldn't sit too near the page boundary either, and
pretty much any address can be valid.  Still, nothing stops other
architectures from defining the delta, and we already appear to rely on it
for the list poisoning. I'd prefer a value below 0x1000, but matching the
list poisoning should work for the most part.
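
As a rough illustration of the trade-off being discussed, the userspace
sketch below composes the value the same way as the proposed macro. It is not
kernel code. It assumes PAGE_MAPPING_ANON is the low bit of page->mapping and
uses 0xdead000000000000 as an example x86-64 delta; both should be checked
against the tree in question. The helper names are made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: PAGE_MAPPING_ANON is the low bit of page->mapping. */
#define PAGE_MAPPING_ANON 0x1ULL

/*
 * Hypothetical helper, not a kernel function: compose the poison the way
 * the proposed macro does, for a given per-architecture delta.
 */
static uint64_t hugetlb_poison(uint64_t poison_pointer_delta)
{
	uint64_t base = 0x00300300ULL + poison_pointer_delta;

	/* '+' only behaves like '|' while the low bit of base is clear. */
	assert((base & PAGE_MAPPING_ANON) == 0);
	return base + PAGE_MAPPING_ANON;
}

/* True if a PageAnon()-style test would see the anonymous flag. */
static bool has_anon_bit(uint64_t mapping)
{
	return (mapping & PAGE_MAPPING_ANON) != 0;
}

/* True if the value lies below 0x1000, the range the 0x2e4 proposal used. */
static bool below_0x1000(uint64_t mapping)
{
	return mapping < 0x1000ULL;
}
```

With a delta of 0 the helper yields 0x00300301, which carries the anonymous
flag but is no longer below 0x1000. With the example x86-64 delta it yields
0xdead000000300301. The earlier 0x2e4 | PAGE_MAPPING_ANON value stays below
0x1000 on every architecture, which is the property given up here.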

How about this?

==== CUT HERE ====
Fix infinite loop in get_futex_key when backed by huge pages

If a futex key happens to be located within a huge page mapped MAP_PRIVATE,
get_futex_key() can go into an infinite loop waiting for a page->mapping that
will never exist. This was reported and documented in an external bugzilla at

https://bugzilla.redhat.com/show_bug.cgi?id=552257

For huge pages mapped MAP_PRIVATE, this patch sets page->mapping to a
poisoned value that includes PAGE_MAPPING_ANON. A non-NULL mapping is enough
for futex to continue, and because PAGE_MAPPING_ANON is set, futex neither
dereferences nor otherwise uses the poisoned value. No other part of the VM
should be dereferencing the page->mapping of a hugetlbfs page as its page
cache is not on the LRU.

This patch fixes the problem with the test case described in the bugzilla.

Signed-off-by: Mel Gorman <mel@csn.ul.ie>
---
 include/linux/poison.h |    9 +++++++++
 mm/hugetlb.c           |    5 ++++-
 2 files changed, 13 insertions(+), 1 deletions(-)

diff --git a/include/linux/poison.h b/include/linux/poison.h
index 2110a81..bab71f3 100644
--- a/include/linux/poison.h
+++ b/include/linux/poison.h
@@ -48,6 +48,15 @@
 #define POISON_FREE	0x6b	/* for use-after-free poisoning */
 #define	POISON_END	0xa5	/* end-byte of poisoning */
 
+/********** mm/hugetlb.c **********/
+/*
+ * Private mappings of hugetlb pages use this poisoned value for
+ * page->mapping. The core VM should not be doing anything with this mapping
+ * but futex requires the existence of some page->mapping value even though it
+ * is unused if PAGE_MAPPING_ANON is set.
+ */
+#define HUGETLB_POISON	((void *)(0x00300300 + POISON_POINTER_DELTA + PAGE_MAPPING_ANON))
+
 /********** arch/$ARCH/mm/init.c **********/
 #define POISON_FREE_INITMEM	0xcc
 
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 6034dc9..ffbdfc8 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -546,6 +546,7 @@ static void free_huge_page(struct page *page)
 
 	mapping = (struct address_space *) page_private(page);
 	set_page_private(page, 0);
+	page->mapping = NULL;
 	BUG_ON(page_count(page));
 	INIT_LIST_HEAD(&page->lru);
 
@@ -2447,8 +2448,10 @@ retry:
 			spin_lock(&inode->i_lock);
 			inode->i_blocks += blocks_per_huge_page(h);
 			spin_unlock(&inode->i_lock);
-		} else
+		} else {
 			lock_page(page);
+			page->mapping = HUGETLB_POISON;
+		}
 	}
 
 	/*
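
For readers following the reasoning in the changelog, here is a minimal
userspace sketch of why a tagged, non-NULL page->mapping is enough to break
the retry loop without the value ever being followed as a pointer. It is a
simplified model under stated assumptions, not the real get_futex_key():
struct fake_page, enum key_kind and classify_mapping() are hypothetical
names, and PAGE_MAPPING_ANON is assumed to be the low bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: PAGE_MAPPING_ANON is the low bit of page->mapping. */
#define PAGE_MAPPING_ANON 0x1UL

/* Hypothetical stand-in for struct page; only the field of interest. */
struct fake_page {
	void *mapping;
};

/* Hypothetical outcome of inspecting page->mapping for a futex key. */
enum key_kind {
	KEY_RETRY,	/* mapping is NULL: the caller would loop and retry */
	KEY_PRIVATE,	/* anonymous flag set: mapping is never dereferenced */
	KEY_SHARED	/* real pointer: only here would it be followed */
};

static enum key_kind classify_mapping(const struct fake_page *page)
{
	assert(page != NULL);

	/* Before the patch a private huge page stayed in this state forever. */
	if (page->mapping == NULL)
		return KEY_RETRY;

	/* The poison carries the flag, so it is treated purely as a tag. */
	if ((uintptr_t)page->mapping & PAGE_MAPPING_ANON)
		return KEY_PRIVATE;

	return KEY_SHARED;
}
```

In this model a page whose mapping is the poison value 0x00300301 is
classified as KEY_PRIVATE, while a page left at NULL keeps returning
KEY_RETRY, which corresponds to the infinite loop described in the report.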
