All of lore.kernel.org
 help / color / mirror / Atom feed
* [rfc][patch] rmap: more sanity checks
@ 2007-02-14  9:04 Nick Piggin
  2007-02-14 16:50 ` Andrea Arcangeli
  2007-02-14 16:52 ` Balbir Singh
  0 siblings, 2 replies; 4+ messages in thread
From: Nick Piggin @ 2007-02-14  9:04 UTC (permalink / raw)
  To: Linux Memory Management List, Andrea Arcangeli, Petr Tesarik,
	Hugh Dickins

We have seen a bug in SLES9 that only gets picked up with Andrea's extra
rmap checks that were removed from mainline.

Petr Tesarik has got a fix for the problem, which he is planning to send
upstream. The issue is a specific condition that causes an anon page to be
incorrectly inserted into the pagetables, outside a valid vma.

It would be nice to get some of these checks back into mainline, IMO. I
wonder if I'm correct in thinking that checking the page index and mapping
is not actually racy?

Thanks,
Nick

--

Index: linux-2.6/mm/rmap.c
===================================================================
--- linux-2.6.orig/mm/rmap.c
+++ linux-2.6/mm/rmap.c
@@ -522,19 +522,49 @@ static void __page_set_anon_rmap(struct 
 }
 
 /**
+ * page_set_anon_rmap - sanity check anonymous rmap addition
+ * @page:	the page to add the mapping to
+ * @vma:	the vm area in which the mapping is added
+ * @address:	the user virtual address mapped
+ */
+static void __page_check_anon_rmap(struct page *page,
+	struct vm_area_struct *vma, unsigned long address)
+{
+	/*
+	 * The page's anon-rmap details (mapping and index) are guaranteed to
+	 * be set up correctly at this point.
+	 *
+	 * We have exclusion against page_add_anon_rmap because the caller
+	 * always holds the page locked, except if called from page_dup_rmap,
+	 * in which case the page is already known to be setup.
+	 *
+	 * We have exclusion against page_add_new_anon_rmap because those pages
+	 * are initially only visible via the pagetables, and the pte is locked
+	 * over the call to page_add_new_anon_rmap.
+	 */
+	struct anon_vma *anon_vma = vma->anon_vma;
+	anon_vma = (void *) anon_vma + PAGE_MAPPING_ANON;
+	BUG_ON(page->mapping != (struct address_space *)anon_vma);
+	BUG_ON(page->index != linear_page_index(vma, address));
+}
+
+/**
  * page_add_anon_rmap - add pte mapping to an anonymous page
  * @page:	the page to add the mapping to
  * @vma:	the vm area in which the mapping is added
  * @address:	the user virtual address mapped
  *
- * The caller needs to hold the pte lock.
+ * The caller needs to hold the pte lock and the page must be locked.
  */
 void page_add_anon_rmap(struct page *page,
 	struct vm_area_struct *vma, unsigned long address)
 {
+	BUG_ON(!PageLocked(page));
+	BUG_ON(address < vma->vm_start || address >= vma->vm_end);
 	if (atomic_inc_and_test(&page->_mapcount))
 		__page_set_anon_rmap(page, vma, address);
-	/* else checking page index and mapping is racy */
+	else
+		__page_check_anon_rmap(page, vma, address);
 }
 
 /*
@@ -545,10 +575,12 @@ void page_add_anon_rmap(struct page *pag
  *
  * Same as page_add_anon_rmap but must only be called on *new* pages.
  * This means the inc-and-test can be bypassed.
+ * Page does not have to be locked.
  */
 void page_add_new_anon_rmap(struct page *page,
 	struct vm_area_struct *vma, unsigned long address)
 {
+	BUG_ON(address < vma->vm_start || address >= vma->vm_end);
 	atomic_set(&page->_mapcount, 0); /* elevate count by 1 (starts at -1) */
 	__page_set_anon_rmap(page, vma, address);
 }
@@ -566,6 +598,24 @@ void page_add_file_rmap(struct page *pag
 }
 
 /**
+ * page_dup_rmap - duplicate pte mapping to a page
+ * @page:	the page to add the mapping to
+ *
+ * For copy_page_range only: minimal extract from page_add_file_rmap /
+ * page_add_anon_rmap, avoiding unnecessary tests (already checked) so it's
+ * quicker.
+ *
+ * The caller needs to hold the pte lock.
+ */
+void page_dup_rmap(struct page *page, struct vm_area_struct *vma, unsigned long address)
+{
+	BUG_ON(page_mapcount(page) == 0);
+	if (PageAnon(page))
+		__page_check_anon_rmap(page, vma, address);
+	atomic_inc(&page->_mapcount);
+}
+
+/**
  * page_remove_rmap - take down pte mapping from a page
  * @page: page to remove mapping from
  *
Index: linux-2.6/include/linux/rmap.h
===================================================================
--- linux-2.6.orig/include/linux/rmap.h
+++ linux-2.6/include/linux/rmap.h
@@ -72,20 +72,9 @@ void __anon_vma_link(struct vm_area_stru
 void page_add_anon_rmap(struct page *, struct vm_area_struct *, unsigned long);
 void page_add_new_anon_rmap(struct page *, struct vm_area_struct *, unsigned long);
 void page_add_file_rmap(struct page *);
+void page_dup_rmap(struct page *page, struct vm_area_struct *vma, unsigned long address);
 void page_remove_rmap(struct page *, struct vm_area_struct *);
 
-/**
- * page_dup_rmap - duplicate pte mapping to a page
- * @page:	the page to add the mapping to
- *
- * For copy_page_range only: minimal extract from page_add_rmap,
- * avoiding unnecessary tests (already checked) so it's quicker.
- */
-static inline void page_dup_rmap(struct page *page)
-{
-	atomic_inc(&page->_mapcount);
-}
-
 /*
  * Called from mm/vmscan.c to handle paging out
  */
Index: linux-2.6/mm/memory.c
===================================================================
--- linux-2.6.orig/mm/memory.c
+++ linux-2.6/mm/memory.c
@@ -481,7 +481,7 @@ copy_one_pte(struct mm_struct *dst_mm, s
 	page = vm_normal_page(vma, addr, pte);
 	if (page) {
 		get_page(page);
-		page_dup_rmap(page);
+		page_dup_rmap(page, vma, addr);
 		rss[!!PageAnon(page)]++;
 	}
 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [rfc][patch] rmap: more sanity checks
  2007-02-14  9:04 [rfc][patch] rmap: more sanity checks Nick Piggin
@ 2007-02-14 16:50 ` Andrea Arcangeli
  2007-02-15  0:46   ` Nick Piggin
  2007-02-14 16:52 ` Balbir Singh
  1 sibling, 1 reply; 4+ messages in thread
From: Andrea Arcangeli @ 2007-02-14 16:50 UTC (permalink / raw)
  To: Nick Piggin; +Cc: Linux Memory Management List, Petr Tesarik, Hugh Dickins

Hi Nick,

On Wed, Feb 14, 2007 at 10:04:25AM +0100, Nick Piggin wrote:
> It would be nice to get some of these checks back into mainline, IMO. I

Obviously seconded. It's a bit ironic that my original implementation
was effectively safer that what was further "sanitized" and pushed
into mainline 8). (of course mainline over the dozen releases was
significantly improved in the locking etc.etc.. I don't mean that, but
as far as safety goes it clearly still lacks)

This isn't the first time that we catch subtle VM bugs in sles9 that
aren't reproducible in mainline (but that affects mainline too). I
tried a few times to complain about the removal of my bugchecks
(notably I recall the ones in do_no_page):

#ifndef CONFIG_DISCONTIGMEM
	/* this check is unreliable with numa enabled */
	BUG_ON(!pfn_valid(page_to_pfn(new_page)));
#endif
	pageable = !PageReserved(new_page);
	as = !!new_page->mapping;

	BUG_ON(!pageable && as);

	pageable &= as;

	/* ->nopage cannot return swapcache */
	BUG_ON(PageSwapCache(new_page));
	/* ->nopage cannot return anonymous pages */
	BUG_ON(PageAnon(new_page));

compare that with mainline...

I hope this incident will be enough to resurrect some of the
"sanitized" bugchecks.

> wonder if I'm correct in thinking that checking the page index and mapping
> is not actually racy?

Index and mapping shouldn't change if the page is locked, and you
already added the BUG_ON(!PageLocked(page)).

Thanks.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [rfc][patch] rmap: more sanity checks
  2007-02-14  9:04 [rfc][patch] rmap: more sanity checks Nick Piggin
  2007-02-14 16:50 ` Andrea Arcangeli
@ 2007-02-14 16:52 ` Balbir Singh
  1 sibling, 0 replies; 4+ messages in thread
From: Balbir Singh @ 2007-02-14 16:52 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Linux Memory Management List, Andrea Arcangeli, Petr Tesarik,
	Hugh Dickins

Nick Piggin wrote:
> We have seen a bug in SLES9 that only gets picked up with Andrea's extra
> rmap checks that were removed from mainline.
> 
> Petr Tesarik has got a fix for the problem, which he is planning to send
> upstream. The issue is a specific condition that causes an anon page to be
> incorrectly inserted into the pagetables, outside a valid vma.
> 
> It would be nice to get some of these checks back into mainline, IMO. I
> wonder if I'm correct in thinking that checking the page index and mapping
> is not actually racy?
> 

I hope so, if that is indeed the case my patches for tracking and accounting
shared rss pages at

http://marc.theaimsgroup.com/?l=linux-mm&m=116738715329816&w=2

will get much simpler.

There used to be a rmap lock (PG_maplock bit) earlier to protect
rmap information

Please see

http://marc.theaimsgroup.com/?l=linux-mm&m=116738715302690&w=2

and

http://lkml.org/lkml/2004/7/12/241


	Regards,
	Balbir Singh

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [rfc][patch] rmap: more sanity checks
  2007-02-14 16:50 ` Andrea Arcangeli
@ 2007-02-15  0:46   ` Nick Piggin
  0 siblings, 0 replies; 4+ messages in thread
From: Nick Piggin @ 2007-02-15  0:46 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: Linux Memory Management List, Petr Tesarik, Hugh Dickins

On Wed, Feb 14, 2007 at 05:50:47PM +0100, Andrea Arcangeli wrote:
> Hi Nick,
> 
> On Wed, Feb 14, 2007 at 10:04:25AM +0100, Nick Piggin wrote:
> > It would be nice to get some of these checks back into mainline, IMO. I
> 
> Obviously seconded. It's a bit ironic that my original implementation
> was effectively safer that what was further "sanitized" and pushed
> into mainline 8). (of course mainline over the dozen releases was
> significantly improved in the locking etc.etc.. I don't mean that, but
> as far as safety goes it clearly still lacks)

Yeah it is suboptimal to have mainline and possibly some other distros
missing out and getting silent corruption.

> This isn't the first time that we catch subtle VM bugs in sles9 that
> aren't reproducible in mainline (but that affects mainline too). I
> tried a few times to complain about the removal of my bugchecks
> (notably I recall the ones in do_no_page):
> 
> #ifndef CONFIG_DISCONTIGMEM
> 	/* this check is unreliable with numa enabled */
> 	BUG_ON(!pfn_valid(page_to_pfn(new_page)));
> #endif
> 	pageable = !PageReserved(new_page);
> 	as = !!new_page->mapping;
> 
> 	BUG_ON(!pageable && as);
> 
> 	pageable &= as;
> 
> 	/* ->nopage cannot return swapcache */
> 	BUG_ON(PageSwapCache(new_page));
> 	/* ->nopage cannot return anonymous pages */
> 	BUG_ON(PageAnon(new_page));
> 
> compare that with mainline...
> 
> I hope this incident will be enough to resurrect some of the
> "sanitized" bugchecks.

Well I added VM_BUG_ON so that we could keep the checks even just for
the value of commenting the code, and those who want to run without
them can choose to. However I may have put too much stuff under
CONFIG_DEBUG_VM which might discourage distros from turning it on.
I'll work on that...

> > wonder if I'm correct in thinking that checking the page index and mapping
> > is not actually racy?
> 
> Index and mapping shouldn't change if the page is locked, and you
> already added the BUG_ON(!PageLocked(page)).

I believe that's the case, yes.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2007-02-15  0:46 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-02-14  9:04 [rfc][patch] rmap: more sanity checks Nick Piggin
2007-02-14 16:50 ` Andrea Arcangeli
2007-02-15  0:46   ` Nick Piggin
2007-02-14 16:52 ` Balbir Singh

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.