Subject: Re: More parallel atomic_open/d_splice_alias fun with NFS and possibly more FSes.
From: Oleg Drokin
To: Al Viro
Cc: Mailing List
Date: Tue, 5 Jul 2016 12:33:09 -0400
Message-Id: <6D633478-6B94-465E-84D7-C0BA59C5E5F5@linuxhacker.ru>
In-Reply-To: <20160705135149.GM14480@ZenIV.linux.org.uk>
X-Mailing-List: linux-kernel@vger.kernel.org

On Jul 5, 2016, at 9:51 AM, Al Viro wrote:

> On Tue, Jul 05, 2016 at 01:31:10PM +0100, Al Viro wrote:
>> On Tue, Jul 05, 2016 at 02:22:48AM -0400, Oleg Drokin wrote:
>>
>>>> +	if (!(open_flags & O_CREAT) && !d_unhashed(dentry)) {
>>
>> s/d_unhashed/d_in_lookup/ in that.
>>
>>> So we come racing here from multiple threads (say 3 or more - we have seen this
>>> in the older crash reports, so totally possible)
>>>
>>>> +		d_drop(dentry);
>>>
>>> One lucky one does this first before the others perform the !d_unhashed check above.
>>> This makes the other ones to not enter here.
>>>
>>> And we are back to the original problem of multiple threads trying to instantiate
>>> same dentry as before.
>>
>> Yep.  See above - it should've been using d_in_lookup() in the first place,
>> through the entire nfs_atomic_open().  Same in the Lustre part of fixes,
>> obviously.
>
> See current #for-linus for hopefully fixed variants (both lustre and nfs)

So at first it looked like we just need another d_init in the other arm
of that "if (d_in_lookup())" statement, but alas.

Also, the patch that replaced the d_unhashed() check with d_in_lookup() now
leaves a stale comment:

	/* Only hash *de if it is unhashed (new dentry).
	 * Atomic_open may passing hashed dentries for open.
	 */
	if (d_in_lookup(*de)) {

Since we no longer check for d_unhashed(), what would be a better choice
of words here? "Only hash *de if it is a new dentry coming from lookup"?

This also makes me question the whole thing some more. We are definitely in lookup
when this hits, so the dentry is already new, yet it does not check off as
d_in_lookup(). Does that also mean that by skipping ll_splice_alias() we fail
to hash it, causing needless lookups later?

Looking back into the history of commits, d_in_lookup() is there to tell us
that we are in the middle of lookup. How can we be in the middle of the lookup
path then and not have this set on the dentry? We know the dentry was not
substituted with anything here because we did not call into ll_splice_alias().
So what's going on then?

Here's a backtrace:

[  146.045148]  [] lbug_with_loc+0x46/0xb0 [libcfs]
[  146.045158]  [] ll_lookup_it_finish+0x713/0xaa0 [lustre]
[  146.045160]  [] ? trace_hardirqs_on+0xd/0x10
[  146.045167]  [] ll_lookup_it+0x29b/0x710 [lustre]
[  146.045173]  [] ? md_set_lock_data.part.25+0x60/0x60 [lustre]
[  146.045179]  [] ll_lookup_nd+0x84/0x190 [lustre]
[  146.045180]  [] __lookup_hash+0x64/0xa0
[  146.045181]  [] ? down_write_nested+0xa8/0xc0
[  146.045182]  [] do_unlinkat+0x1bf/0x2f0
[  146.045183]  [] SyS_unlinkat+0x1b/0x30
[  146.045185]  [] entry_SYSCALL_64_fastpath+0x1f/0xbd

__lookup_hash() does d_alloc() (not the parallel variant) and falls through into
the ->lookup() of the filesystem. So the dots do not connect.

The more I look at it, the more I suspect that hunk is wrong.
Everywhere else you changed in that patch, it was in *atomic_open() with
a well-understood impact. ll_lookup_it_finish(), on the other hand, is a generic
lookup path, not just for atomic opens.

I took out that part of your patch and the problems went away, it seems.
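
To spell out the mismatch between the two checks, here is a toy userspace model. It is only a sketch, not kernel code: struct toy_dentry and all toy_* helpers are made-up names for illustration, and the only assumption carried over from the kernel is that plain d_alloc() hands back an unhashed dentry without the in-lookup mark, while d_alloc_parallel() sets it.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct dentry: only the two states discussed above. */
struct toy_dentry {
	bool hashed;     /* what !d_unhashed() would report */
	bool in_lookup;  /* what d_in_lookup() would report */
};

/* Models plain d_alloc(): new and unhashed, but not marked in-lookup. */
static struct toy_dentry toy_d_alloc(void)
{
	struct toy_dentry d = { .hashed = false, .in_lookup = false };
	return d;
}

/* Models d_alloc_parallel(): new, unhashed and marked in-lookup. */
static struct toy_dentry toy_d_alloc_parallel(void)
{
	struct toy_dentry d = { .hashed = false, .in_lookup = true };
	return d;
}

/* Old condition in the generic lookup path: hash it if it is unhashed. */
static bool toy_old_check(const struct toy_dentry *d)
{
	return !d->hashed;
}

/* New condition after the patch: hash it only if it is in-lookup. */
static bool toy_new_check(const struct toy_dentry *d)
{
	return d->in_lookup;
}

/* Walks both allocation paths through both conditions. */
static void toy_self_check(void)
{
	struct toy_dentry plain = toy_d_alloc();
	struct toy_dentry par = toy_d_alloc_parallel();

	assert(toy_old_check(&plain));   /* old check would splice/hash it */
	assert(!toy_new_check(&plain));  /* new check skips it */
	assert(toy_old_check(&par));     /* parallel path passes both */
	assert(toy_new_check(&par));
}
```

Calling toy_self_check() passes all four assertions: in this model, the dentry from the non-parallel path (the __lookup_hash() case in the backtrace) is the only one where the two conditions disagree.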