From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932404AbcKHK1p (ORCPT ); Tue, 8 Nov 2016 05:27:45 -0500 Received: from mx2.suse.de ([195.135.220.15]:59172 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753533AbcKHK1T (ORCPT ); Tue, 8 Nov 2016 05:27:19 -0500 Date: Tue, 8 Nov 2016 11:27:16 +0100 From: Jan Kara To: Johannes Weiner Cc: Andrew Morton , Linus Torvalds , Jan Kara , "Kirill A. Shutemov" , linux-mm@kvack.org, linux-kernel@vger.kernel.org, kernel-team@fb.com Subject: Re: [PATCH 5/6] mm: workingset: switch shadow entry tracking to radix tree exceptional counting Message-ID: <20161108102716.GL32353@quack2.suse.cz> References: <20161107190741.3619-1-hannes@cmpxchg.org> <20161107190741.3619-6-hannes@cmpxchg.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20161107190741.3619-6-hannes@cmpxchg.org> User-Agent: Mutt/1.5.24 (2015-08-30) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon 07-11-16 14:07:40, Johannes Weiner wrote: > Currently, we track the shadow entries in the page cache in the upper > bits of the radix_tree_node->count, behind the back of the radix tree > implementation. Because the radix tree code has no awareness of them, > we rely on random subtleties throughout the implementation (such as > the node->count != 1 check in the shrinking code which is meant to > exclude multi-entry nodes, but also happens to skip nodes with only > one shadow entry since it's accounted in the upper bits). This is > error prone and has, in fact, caused the bug fixed in d3798ae8c6f3 > ("mm: filemap: don't plant shadow entries without radix tree node"). > > To remove these subtleties, this patch moves shadow entry tracking > from the upper bits of node->count to the existing counter for > exceptional entries. node->count goes back to being a simple counter > of valid entries in the tree node and can be shrunk to a single byte. ... > diff --git a/mm/truncate.c b/mm/truncate.c > index 6ae44571d4c7..d3ce5f261f47 100644 > --- a/mm/truncate.c > +++ b/mm/truncate.c > @@ -53,7 +53,6 @@ static void clear_exceptional_entry(struct address_space *mapping, > mapping->nrexceptional--; > if (!node) > goto unlock; > - workingset_node_shadows_dec(node); > /* > * Don't track node without shadow entries. > * > @@ -61,8 +60,7 @@ static void clear_exceptional_entry(struct address_space *mapping, > * The list_empty() test is safe as node->private_list is > * protected by mapping->tree_lock. > */ > - if (!workingset_node_shadows(node) && > - !list_empty(&node->private_list)) > + if (!node->exceptional && !list_empty(&node->private_list)) > list_lru_del(&workingset_shadow_nodes, > &node->private_list); > __radix_tree_delete_node(&mapping->page_tree, node); Is this really correct now? The radix tree implementation can move a single exceptional entry at index 0 from a node into a direct pointer and free the node while it is still in the LRU list. Or am I missing something? To fix this I'd prefer to just have a callback from radix tree code when it is freeing a node, rather that trying to second-guess its implementation in the page-cache code... Otherwise the patch looks good to me and I really like the simplification! Honza -- Jan Kara SUSE Labs, CR