Subject: Re: More parallel atomic_open/d_splice_alias fun with NFS and possibly more FSes.
From: Oleg Drokin
To: Al Viro
Cc: Mailing List
Date: Tue, 5 Jul 2016 11:21:32 -0400
Message-Id: <73CF0170-DE2B-4335-91EE-D7EE41069BFA@linuxhacker.ru>
In-Reply-To: <20160705135149.GM14480@ZenIV.linux.org.uk>
References: <20160617042914.GD14480@ZenIV.linux.org.uk> <20160703062917.GG14480@ZenIV.linux.org.uk> <94F1587A-7AFC-4B48-A0FC-F4CE152F18CC@linuxhacker.ru> <20160705123110.GL14480@ZenIV.linux.org.uk> <20160705135149.GM14480@ZenIV.linux.org.uk>
X-Mailing-List: linux-kernel@vger.kernel.org

On Jul 5, 2016, at 9:51 AM, Al Viro wrote:

> On Tue, Jul 05, 2016 at 01:31:10PM +0100, Al Viro wrote:
>> On Tue, Jul 05, 2016 at 02:22:48AM -0400, Oleg Drokin wrote:
>>
>>>> +	if (!(open_flags & O_CREAT) && !d_unhashed(dentry)) {
>>
>> s/d_unhashed/d_in_lookup/ in that.
>>
>>> So we come racing here from multiple threads (say 3 or more - we have seen this
>>> in the older crash reports, so totally possible)
>>>
>>>> +		d_drop(dentry);
>>>
>>> One lucky one does this first before the others perform the !d_unhashed check above.
>>> This makes the other ones to not enter here.
>>>
>>> And we are back to the original problem of multiple threads trying to instantiate
>>> same dentry as before.
>>
>> Yep.  See above - it should've been using d_in_lookup() in the first place,
>> through the entire nfs_atomic_open().  Same in the Lustre part of fixes,
>> obviously.
>
> See current #for-linus for hopefully fixed variants (both lustre and nfs)

The first patch of the series:

> @@ -416,9 +416,9 @@ static int ll_lookup_it_finish(struct ptlrpc_request *request,
> ...
> -	if (d_unhashed(*de)) {
> +	if (d_in_lookup(*de)) {
> 		struct dentry *alias;
>
> 		alias = ll_splice_alias(inode, *de);

This breaks Lustre because we now might progress further in this function
without calling into ll_splice_alias and that's the only place that we do
ll_d_init() that later code depends on so we violently crash next time
we call e.g. d_lustre_revalidate() further down that code.

Also I still wonder what's to stop d_alloc_parallel() from returning a hashed
dentry with d_in_lookup() still true?
Certainly there's a big gap between hashing the dentry and dropping the PAR
bit in there that I imagine might allow __d_lookup_rcu() to pick it up
in between?
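
To make the Lustre failure mode above concrete, here is a minimal userspace sketch. It is not kernel code: every `model_*` name is invented for illustration, and the three flags are simplified stand-ins for `!d_unhashed()`, `d_in_lookup()` and "ll_d_init() has already run". It assumes a dentry can reach this point unhashed but no longer in lookup, which is the case the changed check appears to skip.

```c
/* Userspace model only; none of these names exist in the kernel. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_dentry {
	bool hashed;     /* stand-in for !d_unhashed() */
	bool in_lookup;  /* stand-in for d_in_lookup() */
	bool fs_ready;   /* stand-in for "ll_d_init() has run" */
};

/* Stand-in for ll_splice_alias(): the only place per-dentry init happens. */
static void model_splice_alias(struct model_dentry *de)
{
	assert(de != NULL);
	de->fs_ready = true;
}

/* Old check: splice whenever the dentry is unhashed. */
static void model_finish_old(struct model_dentry *de)
{
	if (!de->hashed)
		model_splice_alias(de);
}

/* New check: splice only while the dentry is still in lookup. */
static void model_finish_new(struct model_dentry *de)
{
	if (de->in_lookup)
		model_splice_alias(de);
}

/* Stand-in for later code that depends on the init; false means
 * "the real code would dereference uninitialized per-dentry data". */
static bool model_revalidate_ok(const struct model_dentry *de)
{
	return de->fs_ready;
}
```

With an unhashed dentry that is not in lookup, `model_finish_old()` leaves it ready for `model_revalidate_ok()`, while `model_finish_new()` does not. One possible way out (an assumption, not something the patch does) would be to do the per-dentry init unconditionally before the check.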