From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-0.8 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id AC892C352A3 for ; Thu, 13 Feb 2020 05:55:37 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 8F2CC24649 for ; Thu, 13 Feb 2020 05:55:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729687AbgBMFzh (ORCPT ); Thu, 13 Feb 2020 00:55:37 -0500 Received: from zeniv.linux.org.uk ([195.92.253.2]:49902 "EHLO ZenIV.linux.org.uk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726368AbgBMFzg (ORCPT ); Thu, 13 Feb 2020 00:55:36 -0500 Received: from viro by ZenIV.linux.org.uk with local (Exim 4.92.3 #3 (Red Hat Linux)) id 1j27TD-00Bp5a-8k; Thu, 13 Feb 2020 05:55:27 +0000 Date: Thu, 13 Feb 2020 05:55:27 +0000 From: Al Viro To: "Eric W. Biederman" Cc: Linus Torvalds , LKML , Kernel Hardening , Linux API , Linux FS Devel , Linux Security Module , Akinobu Mita , Alexey Dobriyan , Andrew Morton , Andy Lutomirski , Daniel Micay , Djalal Harouni , "Dmitry V . Levin" , Greg Kroah-Hartman , Ingo Molnar , "J . Bruce Fields" , Jeff Layton , Jonathan Corbet , Kees Cook , Oleg Nesterov , Solar Designer Subject: Re: [PATCH v8 07/11] proc: flush task dcache entries from all procfs instances Message-ID: <20200213055527.GS23230@ZenIV.linux.org.uk> References: <87v9obipk9.fsf@x220.int.ebiederm.org> <20200212200335.GO23230@ZenIV.linux.org.uk> <20200212203833.GQ23230@ZenIV.linux.org.uk> <20200212204124.GR23230@ZenIV.linux.org.uk> <87lfp7h422.fsf@x220.int.ebiederm.org> <87pnejf6fz.fsf@x220.int.ebiederm.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <87pnejf6fz.fsf@x220.int.ebiederm.org> Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Feb 12, 2020 at 10:37:52PM -0600, Eric W. Biederman wrote: > I think I have an alternate idea that could work. Add some extra code > into proc_task_readdir, that would look for dentries that no longer > point to tasks and d_invalidate them. With the same logic probably > being called from a few more places as well like proc_pid_readdir, > proc_task_lookup, and proc_pid_lookup. > > We could even optimize it and have a process died flag we set in the > superblock. > > That would would batch up the freeing work until the next time someone > reads from proc in a way that would create more dentries. So it would > prevent dentries from reaped zombies from growing without bound. > > Hmm. Given the existence of proc_fill_cache it would really be a good > idea if readdir and lookup performed some of the freeing work as well. > As on readdir we always populate the dcache for all of the directory > entries. First of all, that won't do a damn thing when nobody is accessing given superblock. What's more, readdir in root of that procfs instance is not enough - you need it in task/ of group leader. What I don't understand is the insistence on getting those dentries via dcache lookups. _IF_ we are willing to live with cacheline contention (on ->d_lock of root dentry, if nothing else), why not do the following: * put all dentries of such directories ([0-9]* and [0-9]*/task/*) into a list anchored in task_struct; have non-counting reference to task_struct stored in them (might simplify part of get_proc_task() users, BTW - avoids pid-to-task_struct lookups if we have a dentry and not just the inode; many callers do) * have ->d_release() remove from it (protecting per-task_struct lock nested outside of all ->d_lock) * on exit: lock the (per-task_struct) list while list is non-empty pick the first dentry remove from the list sb = dentry->d_sb try to bump sb->s_active (if non-zero, that is). if failed continue // move on to the next one - nothing to do here grab ->d_lock res = handle_it(dentry, &temp_list) drop ->d_lock unlock the list if (!list_empty(&temp_list)) shrink_dentry_list(&temp_list) if (res) d_invalidate(dentry) dput(dentry) deactivate_super(sb) lock the list unlock the list handle_it(dentry, temp_list) // ->d_lock held; that one should be in dcache.c if ->d_count is negative // unlikely return 0; if ->d_count is positive, increment ->d_count return 1; // OK, it's still alive, but ->d_count is 0 __d_drop // equivalent of d_invalidate in this case if not on a shrink list // otherwise it's not our headache if on lru list d_lru_del d_shrink_add dentry to temp_list return 0; And yeah, that'll dirty ->s_active for each procfs superblock that has dentry for our process present in dcache. On exit()...