linux-kernel.vger.kernel.org archive mirror
From: NeilBrown <neilb@suse.de>
To: Jeff Layton <jlayton@kernel.org>,
	Linus Torvalds <torvalds@linux-foundation.org>
Cc: yangerkun <yangerkun@huawei.com>,
	kernel test robot <rong.a.chen@intel.com>,
	LKML <linux-kernel@vger.kernel.org>,
	lkp@lists.01.org, Bruce Fields <bfields@fieldses.org>,
	Al Viro <viro@zeniv.linux.org.uk>
Subject: Re: [locks] 6d390e4b5d: will-it-scale.per_process_ops -96.6% regression
Date: Sat, 14 Mar 2020 13:31:03 +1100
Message-ID: <877dznu0pk.fsf@notabene.neil.brown.name>
In-Reply-To: <f000e352d9e103b3ade3506aac225920420d2323.camel@kernel.org>

On Fri, Mar 13 2020, Jeff Layton wrote:

> On Thu, 2020-03-12 at 09:07 -0700, Linus Torvalds wrote:
>> On Wed, Mar 11, 2020 at 9:42 PM NeilBrown <neilb@suse.de> wrote:
>> > It seems that test_and_set_bit_lock() is the preferred way to handle
>> > flags when memory ordering is important
>> 
>> That looks better.
>> 
>> The _preferred_ way is actually the one I already posted: do a
>> "smp_store_release()" to store the flag (like a NULL pointer), and a
>> smp_load_acquire() to load it.
>> 
>> That's basically optimal on most architectures (all modern ones -
>> there are bad architectures from before people figured out that
>> release/acquire is better than separate memory barriers), not needing
>> any atomics and only minimal memory ordering.
>> 
>> I wonder if a special flags value (keeping it "unsigned int" to avoid
>> the issue Jeff pointed out) might be acceptable?
>> 
>> IOW, could we do just
>> 
>>         smp_store_release(&waiter->fl_flags, FL_RELEASED);
>> 
>> to say that we're done with the lock? Or do people still look at and
>> depend on the flag values at that point?
>
> I think nlmsvc_grant_block does. We could probably work around it
> there, but we'd need to couple this change with some clear
> documentation to make it clear that you can't rely on fl_flags after
> locks_delete_block returns.
>
> If avoiding new locks is preferred here (and I'm fine with that), then
> maybe we should just go with the patch you sent originally (along with
> changing the waiters to wait on fl_blocked_member going empty instead
> of the fl_blocker going NULL)?

I agree.  I've poked at this for a while and come to the conclusion that
I cannot really come up with anything that is structurally better than
your patch.
The idea of list_del_init_release() and list_empty_acquire() is growing
on me though.  See below.
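
To make the idea concrete outside the kernel, here is a minimal user-space
sketch of the two helpers using C11 atomics.  It is only a model under
stated assumptions, not the kernel code: `struct node`, `node_init()`,
`node_add()`, `node_del_init_release()` and `node_empty_acquire()` are
hypothetical names, and the C11 release store and acquire load stand in
for smp_store_release() and smp_load_acquire().  The list is assumed to
be protected by some lock for every operation except the final emptiness
test.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for struct list_head. */
struct node {
	_Atomic(struct node *) next;
	struct node *prev;
};

/* Make 'n' an empty, self-linked entry. */
static inline void node_init(struct node *n)
{
	atomic_store_explicit(&n->next, n, memory_order_relaxed);
	n->prev = n;
}

/* Insert 'n' right after 'head'; the caller holds the list's lock. */
static inline void node_add(struct node *n, struct node *head)
{
	struct node *first =
		atomic_load_explicit(&head->next, memory_order_relaxed);

	n->prev = head;
	atomic_store_explicit(&n->next, first, memory_order_relaxed);
	first->prev = n;
	atomic_store_explicit(&head->next, n, memory_order_relaxed);
}

/*
 * Models list_del_init_release(): unlink 'n', then publish "empty" with
 * a release store, so every earlier access to the enclosing object is
 * ordered before the point where an observer can see n->next == n.
 */
static inline void node_del_init_release(struct node *n)
{
	struct node *next =
		atomic_load_explicit(&n->next, memory_order_relaxed);
	struct node *prev = n->prev;

	next->prev = prev;
	atomic_store_explicit(&prev->next, next, memory_order_relaxed);

	n->prev = n;
	atomic_store_explicit(&n->next, n, memory_order_release);
}

/*
 * Models list_empty_acquire(): pairs with the release store above, so
 * a 'true' result means the remover has finished touching the object.
 */
static inline bool node_empty_acquire(struct node *n)
{
	return atomic_load_explicit(&n->next, memory_order_acquire) == n;
}
```

The point of the pairing is that only the store to ->next needs release
semantics; ->prev is written first with a plain store and is covered by
the same release.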

list_empty_acquire() might be appropriate for waitqueue_active(), which
is documented as requiring a memory barrier, but in practice seems to
often be used without one.

But I'm happy for you to go with your patch that changes all the wait
calls.

NeilBrown



diff --git a/fs/locks.c b/fs/locks.c
index 426b55d333d5..2e5eb677c324 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -174,6 +174,20 @@
 
 #include <linux/uaccess.h>
 
+/* Should go in list.h */
+static inline int list_empty_acquire(const struct list_head *head)
+{
+	return smp_load_acquire(&head->next) == head;
+}
+
+static inline void list_del_init_release(struct list_head *entry)
+{
+	__list_del_entry(entry);
+	entry->prev = entry;
+	smp_store_release(&entry->next, entry);
+}
+
+
 #define IS_POSIX(fl)	(fl->fl_flags & FL_POSIX)
 #define IS_FLOCK(fl)	(fl->fl_flags & FL_FLOCK)
 #define IS_LEASE(fl)	(fl->fl_flags & (FL_LEASE|FL_DELEG|FL_LAYOUT))
@@ -724,7 +738,6 @@ static void locks_delete_global_blocked(struct file_lock *waiter)
 static void __locks_delete_block(struct file_lock *waiter)
 {
 	locks_delete_global_blocked(waiter);
-	list_del_init(&waiter->fl_blocked_member);
 	waiter->fl_blocker = NULL;
 }
 
@@ -740,6 +753,11 @@ static void __locks_wake_up_blocks(struct file_lock *blocker)
 			waiter->fl_lmops->lm_notify(waiter);
 		else
 			wake_up(&waiter->fl_wait);
+		/*
+		 * Tell the world that we're done with it - see comment at
+		 * top of locks_delete_block().
+		 */
+		list_del_init_release(&waiter->fl_blocked_member);
 	}
 }
 
@@ -753,6 +771,25 @@ int locks_delete_block(struct file_lock *waiter)
 {
 	int status = -ENOENT;
 
+	/*
+	 * If fl_blocker is NULL, it won't be set again as this thread
+	 * "owns" the lock and is the only one that might try to claim
+	 * the lock.  So it is safe to test fl_blocker locklessly.
+	 * Also if fl_blocker is NULL, this waiter is not listed on
+	 * fl_blocked_requests for some lock, so no other request can
+	 * be added to the list of fl_blocked_requests for this
+	 * request.  So if fl_blocker is NULL, it is safe to
+	 * locklessly check if fl_blocked_requests is empty.  If both
+	 * of these checks succeed, there is no need to take the lock.
+	 * However, some other thread could still be in __locks_wake_up_blocks()
+	 * and may yet access 'waiter', so we cannot return and possibly
+	 * free the 'waiter' unless we check that __locks_wake_up_blocks()
+	 * is done.  For that we carefully test fl_blocked_member.
+	 */
+	if (waiter->fl_blocker == NULL &&
+	    list_empty(&waiter->fl_blocked_requests) &&
+	    list_empty_acquire(&waiter->fl_blocked_member))
+		return status;
 	spin_lock(&blocked_lock_lock);
 	if (waiter->fl_blocker)
 		status = 0;
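
The ordering that the fast path above relies on can be illustrated with
a heavily reduced, single-threaded model.  This is a sketch under
assumptions, not the fs/locks.c code: `struct waiter_model` and all four
functions are hypothetical, the two lists are collapsed into booleans,
and a C11 release store / acquire load stands in for
list_del_init_release() / list_empty_acquire().

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced model of the fields the fast path inspects. */
struct waiter_model {
	void *blocker;               /* models fl_blocker */
	bool has_blocked_requests;   /* models !list_empty(&fl_blocked_requests) */
	atomic_bool on_blocked_list; /* models !list_empty(&fl_blocked_member) */
};

/* The waiter starts out blocked on 'blocker' and linked on its list. */
static void waiter_block_on(struct waiter_model *w, void *blocker)
{
	w->blocker = blocker;
	w->has_blocked_requests = false;
	atomic_init(&w->on_blocked_list, true);
}

/* Waker, first step: models __locks_delete_block() clearing fl_blocker. */
static void waker_start(struct waiter_model *w)
{
	w->blocker = NULL;
}

/*
 * Waker, last step: models list_del_init_release().  After this release
 * store the waker must not touch 'w' again.
 */
static void waker_finish(struct waiter_model *w)
{
	atomic_store_explicit(&w->on_blocked_list, false,
			      memory_order_release);
}

/*
 * Models the lockless test in locks_delete_block().  A real concurrent
 * version would also need a marked (READ_ONCE-style) read of 'blocker';
 * the plain read is acceptable only in this single-threaded model.
 */
static bool can_skip_lock(struct waiter_model *w)
{
	return w->blocker == NULL &&
	       !w->has_blocked_requests &&
	       !atomic_load_explicit(&w->on_blocked_list,
				     memory_order_acquire);
}
```

The interesting state is the one between waker_start() and
waker_finish(): fl_blocker is already NULL, yet the waiter must not
return (and possibly be freed) because the waker may still be calling
wake_up() or lm_notify() on it.  Only the third test closes that window.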
