linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jeff Layton <jlayton@kernel.org>
To: Linus Torvalds <torvalds@linux-foundation.org>,
	NeilBrown <neilb@suse.de>
Cc: yangerkun <yangerkun@huawei.com>,
	kernel test robot <rong.a.chen@intel.com>,
	LKML <linux-kernel@vger.kernel.org>,
	lkp@lists.01.org, Bruce Fields <bfields@fieldses.org>,
	Al Viro <viro@zeniv.linux.org.uk>
Subject: Re: [locks] 6d390e4b5d: will-it-scale.per_process_ops -96.6% regression
Date: Tue, 10 Mar 2020 18:07:42 -0400	[thread overview]
Message-ID: <0066a9f150a55c13fcc750f6e657deae4ebdef97.camel@kernel.org> (raw)
In-Reply-To: <CAHk-=wg8N4fDRC3M21QJokoU+TQrdnv7HqoaFW-Z-ZT8z_Bi7Q@mail.gmail.com>

On Tue, 2020-03-10 at 14:47 -0700, Linus Torvalds wrote:
> On Tue, Mar 10, 2020 at 2:22 PM NeilBrown <neilb@suse.de> wrote:
> > A compiler barrier() is probably justified.  Memory barriers delay reads
> > and expedite writes so they cannot be needed.
> 
> That's not at all guaranteed. Weakly ordered memory things can
> actually have odd orderings, and not just "writes delayed, reads done
> early". Reads may be delayed too by cache misses, and memory barriers
> can thus expedite reads as well (by forcing the missing read to happen
> before later non-missing ones).
> 
> So don't assume that a memory barrier would only delay reads and
> expedite writes. Quite the reverse: assume that there is no ordering
> at all unless you impose one with a memory barrier (*).
> 
>              Linus
> 
> (*) it's a bit more complex than that, in that we do assume that
> control dependencies end up gating writes, for example, but those
> kinds of implicit ordering things should *not* be what you depend on
> in the code unless you're doing some seriously subtle memory ordering
> work and comment on it extensively.

Good point. I too prefer code that's understandable by mere mortals.

Given that, and the fact that Neil pointed out that yangerkun's latest
patch would reintroduce the original race, I'm leaning back toward the
patch Neil sent yesterday. It relies solely on spinlocks, and so doesn't
have the subtle memory-ordering requirements of the others.

I did some cursory testing with it and it seems to fix the performance
regression. If you guys are OK with this patch, and Neil can send an
updated changelog, I'll get it into -next and we can get this sorted
out.

Thanks,

-------------------8<-------------------

[PATCH] locks: reintroduce locks_delete_block shortcut
---
 fs/locks.c | 29 ++++++++++++++++++++++++++++-
 1 file changed, 28 insertions(+), 1 deletion(-)

diff --git a/fs/locks.c b/fs/locks.c
index 426b55d333d5..8aa04d5ac8b3 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -735,11 +735,13 @@ static void __locks_wake_up_blocks(struct file_lock *blocker)
 
 		waiter = list_first_entry(&blocker->fl_blocked_requests,
 					  struct file_lock, fl_blocked_member);
+		spin_lock(&waiter->fl_wait.lock);
 		__locks_delete_block(waiter);
 		if (waiter->fl_lmops && waiter->fl_lmops->lm_notify)
 			waiter->fl_lmops->lm_notify(waiter);
 		else
-			wake_up(&waiter->fl_wait);
+			wake_up_locked(&waiter->fl_wait);
+		spin_unlock(&waiter->fl_wait.lock);
 	}
 }
 
@@ -753,6 +755,31 @@ int locks_delete_block(struct file_lock *waiter)
 {
 	int status = -ENOENT;
 
+	/*
+	 * If fl_blocker is NULL, it won't be set again as this thread
+	 * "owns" the lock and is the only one that might try to claim
+	 * the lock.  So it is safe to test fl_blocker locklessly.
+	 * Also if fl_blocker is NULL, this waiter is not listed on
+	 * fl_blocked_requests for some lock, so no other request can
+	 * be added to the list of fl_blocked_requests for this
+	 * request.  So if fl_blocker is NULL, it is safe to
+	 * locklessly check if fl_blocked_requests is empty.  If both
+	 * of these checks succeed, there is no need to take the lock.
+	 * However, some other thread might have only *just* set
+	 * fl_blocker to NULL and it about to send a wakeup on
+	 * fl_wait, so we mustn't return too soon or we might free waiter
+	 * before that wakeup can be sent.  So take the fl_wait.lock
+	 * to serialize with the wakeup in __locks_wake_up_blocks().
+	 */
+	if (waiter->fl_blocker == NULL) {
+		spin_lock(&waiter->fl_wait.lock);
+		if (waiter->fl_blocker == NULL &&
+		    list_empty(&waiter->fl_blocked_requests)) {
+			spin_unlock(&waiter->fl_wait.lock);
+			return status;
+		}
+		spin_unlock(&waiter->fl_wait.lock);
+	}
 	spin_lock(&blocked_lock_lock);
 	if (waiter->fl_blocker)
 		status = 0;
-- 
2.24.1



  reply	other threads:[~2020-03-10 22:07 UTC|newest]

Thread overview: 55+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-08 14:03 [locks] 6d390e4b5d: will-it-scale.per_process_ops -96.6% regression kernel test robot
2020-03-09 14:36 ` Jeff Layton
2020-03-09 15:52   ` Linus Torvalds
2020-03-09 17:22     ` Jeff Layton
2020-03-09 19:09       ` Jeff Layton
2020-03-09 19:53         ` Jeff Layton
2020-03-09 21:42         ` NeilBrown
2020-03-09 21:58           ` Jeff Layton
2020-03-10  7:52             ` kernel test robot
2020-03-09 22:11           ` Jeff Layton
2020-03-10  3:24             ` yangerkun
2020-03-10  7:54               ` kernel test robot
2020-03-10 12:52               ` Jeff Layton
2020-03-10 14:18                 ` yangerkun
2020-03-10 15:06                   ` Jeff Layton
2020-03-10 17:27                 ` Jeff Layton
2020-03-10 21:01                   ` NeilBrown
2020-03-10 21:14                     ` Jeff Layton
2020-03-10 21:21                       ` NeilBrown
2020-03-10 21:47                         ` Linus Torvalds
2020-03-10 22:07                           ` Jeff Layton [this message]
2020-03-10 22:31                             ` Linus Torvalds
2020-03-11 22:22                               ` NeilBrown
2020-03-12  0:38                                 ` Linus Torvalds
2020-03-12  4:42                                   ` NeilBrown
2020-03-12 12:31                                     ` Jeff Layton
2020-03-12 22:19                                       ` NeilBrown
2020-03-14  1:11                                         ` Jeff Layton
2020-03-12 16:07                                     ` Linus Torvalds
2020-03-14  1:31                                       ` Jeff Layton
2020-03-14  2:31                                         ` NeilBrown
2020-03-14 15:58                                           ` Linus Torvalds
2020-03-15 13:54                                             ` Jeff Layton
2020-03-16  5:06                                               ` NeilBrown
2020-03-16 11:07                                                 ` Jeff Layton
2020-03-16 17:26                                                   ` Linus Torvalds
2020-03-17  1:41                                                     ` yangerkun
2020-03-17 14:05                                                       ` yangerkun
2020-03-17 16:07                                                         ` Jeff Layton
2020-03-18  1:09                                                           ` yangerkun
2020-03-19 17:51                                                     ` Jeff Layton
2020-03-19 19:23                                                       ` Linus Torvalds
2020-03-19 19:24                                                         ` Jeff Layton
2020-03-19 19:35                                                           ` Linus Torvalds
2020-03-19 20:10                                                             ` Jeff Layton
2020-03-16 22:45                                                   ` NeilBrown
2020-03-17 15:59                                                     ` Jeff Layton
2020-03-17 21:27                                                       ` NeilBrown
2020-03-18  5:12                                                   ` kernel test robot
2020-03-16  4:26                                             ` NeilBrown
2020-03-11  1:57                     ` yangerkun
2020-03-11 12:52                       ` Jeff Layton
2020-03-11 13:26                         ` yangerkun
2020-03-11 22:15                       ` NeilBrown
2020-03-10  7:50           ` kernel test robot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=0066a9f150a55c13fcc750f6e657deae4ebdef97.camel@kernel.org \
    --to=jlayton@kernel.org \
    --cc=bfields@fieldses.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lkp@lists.01.org \
    --cc=neilb@suse.de \
    --cc=rong.a.chen@intel.com \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@zeniv.linux.org.uk \
    --cc=yangerkun@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).