From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-11.3 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_SANE_1 autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9B233C2D0A3 for ; Tue, 3 Nov 2020 10:41:35 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 41B3820756 for ; Tue, 3 Nov 2020 10:41:35 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=codewreck.org header.i=@codewreck.org header.b="4lHjFmbk" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727964AbgKCKle (ORCPT ); Tue, 3 Nov 2020 05:41:34 -0500 Received: from nautica.notk.org ([91.121.71.147]:58869 "EHLO nautica.notk.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726388AbgKCKld (ORCPT ); Tue, 3 Nov 2020 05:41:33 -0500 Received: by nautica.notk.org (Postfix, from userid 1001) id AAD5AC01B; Tue, 3 Nov 2020 11:41:31 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=codewreck.org; s=2; t=1604400091; bh=gXMvbyFmzw+WHPPo8kBZIq+goTbj33ryWfuToPLkWdc=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=4lHjFmbkIYxyZHYzzqh54nfYbvRUTMjB1H525D1XITLm35130K97AYSlDxkEQ60u/ ZmxcFuPCw6t7kt81p+QdqgdVm2ElsPz2QFGEYRuaTJbGAlLXYKXQ+ISnnS3NS0yDqa RZeic6/fT6ADxMoL3lpQm1iiZdyFAQYkaEG+R90a3m3U4o9+0Y3w0FZr7Y9jV7jva8 KCmOuh78VaBzbRNksKc2rCvvnS7qdqN4P6ZUgQ9oK1Rkzp/vwN6886Tf9CXxiVH+yN C+jP8K9i+yF5mia9ZteStcoCks0uXVNjG+rhLHNIr8VPBOn1FQSjzNEmO87BOTxNhk NTC/eActIX4Ig== Date: Tue, 3 Nov 2020 11:41:16 +0100 From: Dominique Martinet To: Jianyong Wu Cc: ericvh@gmail.com, lucho@ionkov.net, qemu_oss@crudebyte.com, groug@kaod.org, v9fs-developer@lists.sourceforge.net, linux-kernel@vger.kernel.org, justin.he@arm.com Subject: Re: [PATCH RFC v2 4/4] 9p: fix race issue in fid contention. Message-ID: <20201103104116.GA19587@nautica> References: <20200923141146.90046-1-jianyong.wu@arm.com> <20200923141146.90046-5-jianyong.wu@arm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <20200923141146.90046-5-jianyong.wu@arm.com> User-Agent: Mutt/1.5.21 (2010-09-15) Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Jianyong, Jianyong Wu wrote on Wed, Sep 23, 2020: > Eric's and Greg's patch offer a mechanism to fix open-unlink-f*syscall > bug in 9p. But there is race issue in fid comtention. > As Greg's patch stores all of fids from opened files into according inode, > so all the lookup fid ops can retrieve fid from inode preferentially. But > there is no mechanism to handle the fid comtention issue. For example, > there are two threads get the same fid in the same time and one of them > clunk the fid before the other thread ready to discard the fid. In this > scenario, it will lead to some fatal problems, even kernel core dump. > > I introduce a mechanism to fix this race issue. A counter field introduced > into p9_fid struct to store the reference counter to the fid. When a fid > is allocated from the inode or dentry, the counter will increase, and > will decrease at the end of its occupation. It is guaranteed that the > fid won't be clunked before the reference counter go down to 0, then > we can avoid the clunked fid to be used. > > tests: > race issue test from the old test case: > for file in {01..50}; do touch f.${file}; done > seq 1 1000 | xargs -n 1 -P 50 -I{} cat f.* > /dev/null > > open-unlink-f*syscall test: > I have tested for f*syscall include: ftruncate fstat fchown fchmod faccessat. > > Signed-off-by: Jianyong Wu Ok so I've finally taken some time to test -- sorry for the delay. I didn't bother with qemu but the use-after-close f* calls work with nfs-ganesha and it doesn't introduce any obvious regression with the current qemu either, so this is good for me. Greg, Christian - from what I understood (in private, hopefully I'm allowed to repeat!), he won't be able to contribute to qemu because of company policies and I'm unlikely to take the time either right now. I don't think it's a problem to continue as is though, we can land linux kernel support (it's still useful for non-qemu servers) and if someone is interested later on they'll just need to finish that bit. I'm not seeing any obvious problem now so I'll take these patches in -next within the next few days, with this extra patch on top that basically applies the requests I had: - removing the extra atomic_set in fs/9p/fid.c - convert from atomic to refcount API (overflow checks) - rename a no-composant walk to clone_fid() I'll just run some more instrumented tests first to check we're not leaking anything but so far I haven't found any problem. If you noticed anything else please speak up. Thanks for taking the time to finish this! :) --- fs/9p/fid.c | 10 ++++------ fs/9p/fid.h | 2 +- include/net/9p/client.h | 2 +- net/9p/client.c | 4 ++-- 4 files changed, 8 insertions(+), 10 deletions(-) diff --git a/fs/9p/fid.c b/fs/9p/fid.c index 89643dabcdae..50118ec72a92 100644 --- a/fs/9p/fid.c +++ b/fs/9p/fid.c @@ -28,7 +28,6 @@ static inline void __add_fid(struct dentry *dentry, struct p9_fid *fid) { - atomic_set(&fid->count, 1); hlist_add_head(&fid->dlist, (struct hlist_head *)&dentry->d_fsdata); } @@ -62,7 +61,7 @@ static struct p9_fid *v9fs_fid_find_inode(struct inode *inode, kuid_t uid) } } if (ret && !IS_ERR(ret)) - atomic_inc(&ret->count); + refcount_inc(&ret->count); spin_unlock(&inode->i_lock); return ret; } @@ -77,7 +76,6 @@ static struct p9_fid *v9fs_fid_find_inode(struct inode *inode, kuid_t uid) void v9fs_open_fid_add(struct inode *inode, struct p9_fid *fid) { spin_lock(&inode->i_lock); - atomic_set(&fid->count, 1); hlist_add_head(&fid->ilist, (struct hlist_head *)&inode->i_private); spin_unlock(&inode->i_lock); } @@ -110,7 +108,7 @@ static struct p9_fid *v9fs_fid_find(struct dentry *dentry, kuid_t uid, int any) hlist_for_each_entry(fid, h, dlist) { if (any || uid_eq(fid->uid, uid)) { ret = fid; - atomic_inc(&ret->count); + refcount_inc(&ret->count); break; } } @@ -201,7 +199,7 @@ static struct p9_fid *v9fs_fid_lookup_with_uid(struct dentry *dentry, } /* If we are root ourself just return that */ if (dentry->d_sb->s_root == dentry) { - atomic_inc(&fid->count); + refcount_inc(&fid->count); return fid; } /* @@ -250,7 +248,7 @@ static struct p9_fid *v9fs_fid_lookup_with_uid(struct dentry *dentry, fid = ERR_PTR(-ENOENT); } else { __add_fid(dentry, fid); - atomic_inc(&fid->count); + refcount_inc(&fid->count); spin_unlock(&dentry->d_lock); } } diff --git a/fs/9p/fid.h b/fs/9p/fid.h index 1fed96546728..f7f33509e169 100644 --- a/fs/9p/fid.h +++ b/fs/9p/fid.h @@ -28,7 +28,7 @@ static inline struct p9_fid *v9fs_fid_clone(struct dentry *dentry) if (!fid || IS_ERR(fid)) return fid; - nfid = p9_client_walk(fid, 0, NULL, 1); + nfid = clone_fid(fid); p9_client_clunk(fid); return nfid; } diff --git a/include/net/9p/client.h b/include/net/9p/client.h index 58ed9bd306bd..e1c308d8d288 100644 --- a/include/net/9p/client.h +++ b/include/net/9p/client.h @@ -149,7 +149,7 @@ enum fid_source { struct p9_fid { struct p9_client *clnt; u32 fid; - atomic_t count; + refcount_t count; int mode; struct p9_qid qid; u32 iounit; diff --git a/net/9p/client.c b/net/9p/client.c index a6c8a915e0d8..ba4910138c5b 100644 --- a/net/9p/client.c +++ b/net/9p/client.c @@ -901,7 +901,7 @@ static struct p9_fid *p9_fid_create(struct p9_client *clnt) fid->clnt = clnt; fid->rdir = NULL; fid->fid = 0; - atomic_set(&fid->count, 1); + refcount_set(&fid->count, 1); idr_preload(GFP_KERNEL); spin_lock_irq(&clnt->lock); @@ -1466,7 +1466,7 @@ int p9_client_clunk(struct p9_fid *fid) dump_stack(); return 0; } - if (!atomic_dec_and_test(&fid->count)) + if (!refcount_dec_and_test(&fid->count)) return 0; again: