Date: Tue, 10 Sep 2019 23:17:46 +0100
From: Al Viro
To: "zhengbin (A)"
Cc: jack@suse.cz, akpm@linux-foundation.org, linux-fsdevel@vger.kernel.org, "zhangyi (F)", renxudong1@huawei.com, Hou Tao, Linus Torvalds
Subject: Re: Possible FS race condition between iterate_dir and d_alloc_parallel
Message-ID: <20190910221746.GJ1131@ZenIV.linux.org.uk>
In-Reply-To: <20190910215357.GH1131@ZenIV.linux.org.uk>
X-Mailing-List: linux-fsdevel@vger.kernel.org

On Tue, Sep 10, 2019 at 10:53:57PM +0100, Al Viro wrote:

>       * we might need to grab dentry reference around dir_emit() in dcache_readdir().
> As it is, devpts makes it very easy to fuck that one up.

FWIW, that goes back to commit 8ead9dd54716 (devpts: more pty driver interface
cleanups) three years ago.  Rule of thumb: whenever you write "no actual
semantic changes" in a commit message, you are summoning Murphy...

>       * it might make sense to turn next_positive() into "try to walk that much,
> return a pinned dentry, drop the original one, report how much we'd walked".
> That would allow to bring ->d_lock back and short-term it might be the best
> solution.  IOW,
> int next_positive(parent, from, count, dentry)
>       grab ->d_lock
>       walk the list, decrementing count on hashed positive ones
>               if we see need_resched
>                       break
>       if we hadn't reached the end, grab whatever we'd reached
>       drop ->d_lock
>       dput(*dentry)
>       if need_resched
>               schedule
>       *dentry = whatever we'd grabbed or NULL
>       return count;
>
> The callers would use that sucker in a loop - readdir would just need to
> initialize next to NULL and do
>       while (next_positive(dentry, p, 1, &next), next != NULL) {
> in the loop, with dput(next) in the very end.  And lseek would do
>       to = NULL;
>       p = &dentry->d_subdirs;
>       do {
>               n = next_positive(dentry, p, n, &to);
>               if (!to)
>                       break;
>               p = &to->d_child;
>       } while (n);
>       move_cursor(cursor, to ? p : NULL);
>       dput(to);
> instead of
>       to = next_positive(dentry, &dentry->d_subdirs, n);
>       move_cursor(cursor, to ? &to->d_child : NULL);
>
> Longer term I would really like to get rid of ->d_lock in that thing,
> but it's much too late in this cycle for that.
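
For illustration, the quoted next_positive() proposal and its lseek-style caller can be modelled in userspace roughly as follows.  This is only a sketch under stated assumptions, not the kernel code: struct node, node_get(), node_put(), link_nodes() and skip_positive() are hypothetical names, the budget argument is a made-up stand-in for the need_resched() check, `from == NULL` is used here to mean "start at the head of the list", and taking and dropping ->d_lock is reduced to comments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a dentry on its parent's child list. */
struct node {
	struct node *next;	/* next sibling, NULL at the end of the list */
	bool positive;		/* models "hashed and positive" */
	int refs;		/* models the dentry reference count */
};

static void node_get(struct node *n)
{
	n->refs++;
}

/* Like dput() in the sketch above: a NULL argument is a no-op. */
static void node_put(struct node *n)
{
	if (n) {
		assert(n->refs > 0);
		n->refs--;
	}
}

/* Test helper: chain an array into a list and set the "positive" flags. */
static void link_nodes(struct node *v, const bool *positive, size_t len)
{
	for (size_t i = 0; i < len; i++) {
		v[i].next = (i + 1 < len) ? &v[i + 1] : NULL;
		v[i].positive = positive[i];
		v[i].refs = 0;
	}
}

/*
 * Walk the list after 'from' (or from 'head' if 'from' is NULL), counting
 * down on positive entries.  Stop when count hits zero, when the budget
 * (assumed >= 1, standing in for need_resched()) runs out, or at the end
 * of the list.  Whatever was reached is pinned, the previously pinned
 * entry is dropped, and the remaining count is returned.
 */
static int next_positive(struct node *head, struct node *from, int count,
			 int budget, struct node **pinned)
{
	struct node *p = from ? from->next : head;
	struct node *reached = NULL;

	assert(count > 0 && budget > 0);
	/* the kernel version would grab ->d_lock here */
	for (; p; p = p->next) {
		if (p->positive && --count == 0) {
			reached = p;	/* found the entry we wanted */
			break;
		}
		if (--budget == 0) {
			reached = p;	/* give up early, remember position */
			break;
		}
	}
	if (reached)
		node_get(reached);	/* pin before "dropping the lock" */
	/* the kernel version would drop ->d_lock and maybe schedule here */
	node_put(*pinned);
	*pinned = reached;
	return count;
}

/* lseek-style caller: skip n positive entries, return the pinned result. */
static struct node *skip_positive(struct node *head, int n, int budget)
{
	struct node *to = NULL;
	struct node *from = NULL;

	do {
		n = next_positive(head, from, n, budget, &to);
		if (!to)
			break;
		from = to;
	} while (n);
	return to;	/* caller owns one reference, or gets NULL */
}
```

The point the model tries to capture is that the entry reached is pinned before the lock is released, so the walk can be resumed from it on the next call, and that at most one entry is pinned at any time because each call drops the reference taken by the previous one.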