From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S932336AbcKHQNO (ORCPT ); Tue, 8 Nov 2016 11:13:14 -0500
Received: from gum.cmpxchg.org ([85.214.110.215]:46794 "EHLO gum.cmpxchg.org"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S932265AbcKHQMz (ORCPT ); Tue, 8 Nov 2016 11:12:55 -0500
Date: Tue, 8 Nov 2016 11:12:45 -0500
From: Johannes Weiner
To: Jan Kara
Cc: Andrew Morton, Linus Torvalds, "Kirill A. Shutemov",
	linux-mm@kvack.org, linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH 1/6] mm: khugepaged: fix radix tree node leak in shmem collapse error path
Message-ID: <20161108161245.GA4020@cmpxchg.org>
References: <20161107190741.3619-1-hannes@cmpxchg.org>
	<20161107190741.3619-2-hannes@cmpxchg.org>
	<20161108095352.GH32353@quack2.suse.cz>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20161108095352.GH32353@quack2.suse.cz>
User-Agent: Mutt/1.7.1 (2016-10-04)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Nov 08, 2016 at 10:53:52AM +0100, Jan Kara wrote:
> On Mon 07-11-16 14:07:36, Johannes Weiner wrote:
> > The radix tree counts valid entries in each tree node. Entries stored
> > in the tree cannot be removed by simply storing NULL in the slot or
> > the internal counters will be off and the node never gets freed again.
> >
> > When collapsing a shmem page fails, restore the holes that were filled
> > with radix_tree_insert() with a proper radix tree deletion.
> >
> > Fixes: f3f0e1d2150b ("khugepaged: add support of collapse for tmpfs/shmem pages")
> > Reported-by: Jan Kara
> > Signed-off-by: Johannes Weiner
> > ---
> >  mm/khugepaged.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> > index 728d7790dc2d..eac6f0580e26 100644
> > --- a/mm/khugepaged.c
> > +++ b/mm/khugepaged.c
> > @@ -1520,7 +1520,8 @@ static void collapse_shmem(struct mm_struct *mm,
> >  			if (!nr_none)
> >  				break;
> >  			/* Put holes back where they were */
> > -			radix_tree_replace_slot(slot, NULL);
> > +			radix_tree_delete(&mapping->page_tree,
> > +					  iter.index);
>
> Hum, but this is inside radix_tree_for_each_slot() iteration. And
> radix_tree_delete() may end up freeing nodes resulting in invalidating
> current slot pointer and the iteration code will do use-after-free.

Good point, we need to do another tree lookup after the deletion.

But there are other instances in the code, where we drop the lock
temporarily and somebody else could delete the node from under us.

In the main collapse path, I *think* this is prevented by the fact
that when we drop the tree lock we still hold the page lock of the
regular page that's in the tree while we isolate and unmap it, thus
pin the node. Even so, it would seem a little hairy to rely on that.

Kirill?

I'll update this patch and prepend another fix to the series that
addresses the other two lock dropping issues. Thanks Jan.

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index fed8d5e96978..1e43e77a98da 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1424,6 +1424,7 @@ static void collapse_shmem(struct mm_struct *mm,
 		radix_tree_replace_slot(&mapping->page_tree, slot,
 				new_page + (index % HPAGE_PMD_NR));
 
+		slot = radix_tree_iter_next(&iter);
 		index++;
 		continue;
 out_lru:
@@ -1522,6 +1523,7 @@ static void collapse_shmem(struct mm_struct *mm,
 			/* Put holes back where they were */
 			radix_tree_delete(&mapping->page_tree,
 					  iter.index);
+			slot = radix_tree_iter_next(&iter);
 			nr_none--;
 			continue;
 		}
@@ -1537,6 +1539,7 @@ static void collapse_shmem(struct mm_struct *mm,
 		putback_lru_page(page);
 		unlock_page(page);
 		spin_lock_irq(&mapping->tree_lock);
+		slot = radix_tree_iter_next(&iter);
 	}
 	VM_BUG_ON(nr_none);
 	spin_unlock_irq(&mapping->tree_lock);
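
For anyone following along who is less familiar with the node accounting
discussed above, here is a rough user-space sketch in C. It is not the
kernel's radix tree: toy_tree, toy_node and all toy_*() functions are
hypothetical names made up for illustration, and the "tree" is a single
node. It only tries to show why clearing a slot by hand leaves the entry
count stale so the node is never freed, and why a saved slot pointer has
to be looked up again after a real deletion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_SLOTS 4

/* Hypothetical one-level "tree": a single node that counts valid entries. */
struct toy_node {
	unsigned int count;		/* number of non-NULL slots */
	void *slots[TOY_SLOTS];
};

struct toy_tree {
	struct toy_node *node;		/* NULL while the tree is empty */
};

/* Insert an entry, allocating the node on demand. Returns 0 on success. */
static int toy_insert(struct toy_tree *tree, size_t index, void *item)
{
	if (index >= TOY_SLOTS || item == NULL)
		return -1;
	if (tree->node == NULL) {
		tree->node = calloc(1, sizeof(*tree->node));
		if (tree->node == NULL)
			return -1;
	}
	if (tree->node->slots[index] != NULL)
		return -1;
	tree->node->slots[index] = item;
	tree->node->count++;
	return 0;
}

/* Return a pointer to the slot for index, or NULL if there is no node. */
static void **toy_lookup_slot(struct toy_tree *tree, size_t index)
{
	if (tree->node == NULL || index >= TOY_SLOTS)
		return NULL;
	return &tree->node->slots[index];
}

/* Proper deletion: fix up the count and free the node once it is empty. */
static void toy_delete(struct toy_tree *tree, size_t index)
{
	struct toy_node *node = tree->node;

	if (node == NULL || index >= TOY_SLOTS || node->slots[index] == NULL)
		return;
	node->slots[index] = NULL;
	if (--node->count == 0) {
		free(node);		/* saved slot pointers are now stale */
		tree->node = NULL;
	}
}

/* Clear the slot by hand: count stays at 1 and the node is never freed. */
static int toy_demo_leak(void)
{
	struct toy_tree tree = { NULL };
	int dummy = 0;
	void **slot;
	int leaked;

	if (toy_insert(&tree, 2, &dummy) != 0)
		return -1;
	slot = toy_lookup_slot(&tree, 2);
	*slot = NULL;			/* bypasses the accounting */
	leaked = (tree.node != NULL && tree.node->count == 1);
	free(tree.node);		/* tidy up the demo itself */
	return leaked;
}

/* Delete properly, then look the slot up again instead of reusing it. */
static int toy_demo_delete(void)
{
	struct toy_tree tree = { NULL };
	int dummy = 0;
	void **slot;

	if (toy_insert(&tree, 2, &dummy) != 0)
		return -1;
	slot = toy_lookup_slot(&tree, 2);
	if (slot == NULL)
		return -1;
	toy_delete(&tree, 2);		/* frees the node; slot must not be used */
	slot = toy_lookup_slot(&tree, 2);
	return tree.node == NULL && slot == NULL;
}
```

In this toy, toy_demo_leak() mirrors the radix_tree_replace_slot(slot, NULL)
case from the original patch discussion, and toy_demo_delete() mirrors the
"do another lookup after the deletion" point. The real iterator handling is
of course more involved than a single re-lookup.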