From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752157Ab0CQCDM (ORCPT ); Tue, 16 Mar 2010 22:03:12 -0400 Received: from fgwmail5.fujitsu.co.jp ([192.51.44.35]:41660 "EHLO fgwmail5.fujitsu.co.jp" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751973Ab0CQCDL (ORCPT ); Tue, 16 Mar 2010 22:03:11 -0400 X-SecurityPolicyCheck-FJ: OK by FujitsuOutboundMailChecker v1.3.1 From: KOSAKI Motohiro To: Mel Gorman Subject: Re: [PATCH 02/11] mm,migration: Do not try to migrate unmapped anonymous pages Cc: kosaki.motohiro@jp.fujitsu.com, Minchan Kim , KAMEZAWA Hiroyuki , Andrew Morton , Andrea Arcangeli , Christoph Lameter , Adam Litke , Avi Kivity , David Rientjes , Rik van Riel , linux-kernel@vger.kernel.org, linux-mm@kvack.org In-Reply-To: <20100315142124.GL18274@csn.ul.ie> References: <1268657329.1889.4.camel@barrios-desktop> <20100315142124.GL18274@csn.ul.ie> Message-Id: <20100317104734.4C8E.A69D9226@jp.fujitsu.com> MIME-Version: 1.0 Content-Type: text/plain; charset="US-ASCII" Content-Transfer-Encoding: 7bit X-Mailer: Becky! ver. 2.50.07 [ja] Date: Wed, 17 Mar 2010 11:03:05 +0900 (JST) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org > mm,migration: Do not try to migrate unmapped anonymous pages > > rmap_walk_anon() was triggering errors in memory compaction that look like > use-after-free errors. The problem is that between the page being isolated > from the LRU and rcu_read_lock() being taken, the mapcount of the page > dropped to 0 and the anon_vma gets freed. This can happen during memory > compaction if pages being migrated belong to a process that exits before > migration completes. Hence, the use-after-free race looks like > > 1. Page isolated for migration > 2. Process exits > 3. page_mapcount(page) drops to zero so anon_vma was no longer reliable > 4. unmap_and_move() takes the rcu_lock but the anon_vma is already garbage > 4. call try_to_unmap, looks up tha anon_vma and "locks" it but the lock > is garbage. > > This patch checks the mapcount after the rcu lock is taken. If the > mapcount is zero, the anon_vma is assumed to be freed and no further > action is taken. > > Signed-off-by: Mel Gorman > Acked-by: Rik van Riel > --- > mm/migrate.c | 13 +++++++++++++ > 1 files changed, 13 insertions(+), 0 deletions(-) > > diff --git a/mm/migrate.c b/mm/migrate.c > index 98eaaf2..6eb1efe 100644 > --- a/mm/migrate.c > +++ b/mm/migrate.c > @@ -603,6 +603,19 @@ static int unmap_and_move(new_page_t get_new_page, unsigned long private, > */ > if (PageAnon(page)) { > rcu_read_lock(); > + > + /* > + * If the page has no mappings any more, just bail. An > + * unmapped anon page is likely to be freed soon but worse, > + * it's possible its anon_vma disappeared between when > + * the page was isolated and when we reached here while > + * the RCU lock was not held > + */ > + if (!page_mapcount(page)) { > + rcu_read_unlock(); > + goto uncharge; > + } I haven't understand what prevent this check. Why don't we need following scenario? 1. Page isolated for migration 2. Passed this if (!page_mapcount(page)) check 3. Process exits 4. page_mapcount(page) drops to zero so anon_vma was no longer reliable Traditionally, page migration logic is, it can touch garbarge of anon_vma, but SLAB_DESTROY_BY_RCU prevent any disaster. Is this broken concept? From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail137.messagelabs.com (mail137.messagelabs.com [216.82.249.19]) by kanga.kvack.org (Postfix) with SMTP id 0F17C6B00AC for ; Tue, 16 Mar 2010 22:03:10 -0400 (EDT) Received: from m2.gw.fujitsu.co.jp ([10.0.50.72]) by fgwmail5.fujitsu.co.jp (Fujitsu Gateway) with ESMTP id o2H2382H005899 for (envelope-from kosaki.motohiro@jp.fujitsu.com); Wed, 17 Mar 2010 11:03:08 +0900 Received: from smail (m2 [127.0.0.1]) by outgoing.m2.gw.fujitsu.co.jp (Postfix) with ESMTP id D084445DE4F for ; Wed, 17 Mar 2010 11:03:07 +0900 (JST) Received: from s2.gw.fujitsu.co.jp (s2.gw.fujitsu.co.jp [10.0.50.92]) by m2.gw.fujitsu.co.jp (Postfix) with ESMTP id 980AB45DE55 for ; Wed, 17 Mar 2010 11:03:07 +0900 (JST) Received: from s2.gw.fujitsu.co.jp (localhost.localdomain [127.0.0.1]) by s2.gw.fujitsu.co.jp (Postfix) with ESMTP id 69A2F1DB803C for ; Wed, 17 Mar 2010 11:03:07 +0900 (JST) Received: from ml14.s.css.fujitsu.com (ml14.s.css.fujitsu.com [10.249.87.104]) by s2.gw.fujitsu.co.jp (Postfix) with ESMTP id 19607E38004 for ; Wed, 17 Mar 2010 11:03:07 +0900 (JST) From: KOSAKI Motohiro Subject: Re: [PATCH 02/11] mm,migration: Do not try to migrate unmapped anonymous pages In-Reply-To: <20100315142124.GL18274@csn.ul.ie> References: <1268657329.1889.4.camel@barrios-desktop> <20100315142124.GL18274@csn.ul.ie> Message-Id: <20100317104734.4C8E.A69D9226@jp.fujitsu.com> MIME-Version: 1.0 Content-Type: text/plain; charset="US-ASCII" Content-Transfer-Encoding: 7bit Date: Wed, 17 Mar 2010 11:03:05 +0900 (JST) Sender: owner-linux-mm@kvack.org To: Mel Gorman Cc: kosaki.motohiro@jp.fujitsu.com, Minchan Kim , KAMEZAWA Hiroyuki , Andrew Morton , Andrea Arcangeli , Christoph Lameter , Adam Litke , Avi Kivity , David Rientjes , Rik van Riel , linux-kernel@vger.kernel.org, linux-mm@kvack.org List-ID: > mm,migration: Do not try to migrate unmapped anonymous pages > > rmap_walk_anon() was triggering errors in memory compaction that look like > use-after-free errors. The problem is that between the page being isolated > from the LRU and rcu_read_lock() being taken, the mapcount of the page > dropped to 0 and the anon_vma gets freed. This can happen during memory > compaction if pages being migrated belong to a process that exits before > migration completes. Hence, the use-after-free race looks like > > 1. Page isolated for migration > 2. Process exits > 3. page_mapcount(page) drops to zero so anon_vma was no longer reliable > 4. unmap_and_move() takes the rcu_lock but the anon_vma is already garbage > 4. call try_to_unmap, looks up tha anon_vma and "locks" it but the lock > is garbage. > > This patch checks the mapcount after the rcu lock is taken. If the > mapcount is zero, the anon_vma is assumed to be freed and no further > action is taken. > > Signed-off-by: Mel Gorman > Acked-by: Rik van Riel > --- > mm/migrate.c | 13 +++++++++++++ > 1 files changed, 13 insertions(+), 0 deletions(-) > > diff --git a/mm/migrate.c b/mm/migrate.c > index 98eaaf2..6eb1efe 100644 > --- a/mm/migrate.c > +++ b/mm/migrate.c > @@ -603,6 +603,19 @@ static int unmap_and_move(new_page_t get_new_page, unsigned long private, > */ > if (PageAnon(page)) { > rcu_read_lock(); > + > + /* > + * If the page has no mappings any more, just bail. An > + * unmapped anon page is likely to be freed soon but worse, > + * it's possible its anon_vma disappeared between when > + * the page was isolated and when we reached here while > + * the RCU lock was not held > + */ > + if (!page_mapcount(page)) { > + rcu_read_unlock(); > + goto uncharge; > + } I haven't understand what prevent this check. Why don't we need following scenario? 1. Page isolated for migration 2. Passed this if (!page_mapcount(page)) check 3. Process exits 4. page_mapcount(page) drops to zero so anon_vma was no longer reliable Traditionally, page migration logic is, it can touch garbarge of anon_vma, but SLAB_DESTROY_BY_RCU prevent any disaster. Is this broken concept? -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org