From: Peter Zijlstra <peterz@infradead.org>
To: Waiman Long <longman@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Ingo Molnar <mingo@redhat.com>, Will Deacon <will.deacon@arm.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Linux List Kernel Mailing <linux-kernel@vger.kernel.org>,
	the arch/x86 maintainers <x86@kernel.org>,
	Davidlohr Bueso <dave@stgolabs.net>,
	Tim Chen <tim.c.chen@linux.intel.com>,
	huang ying <huang.ying.caritas@gmail.com>
Subject: Re: [PATCH v4 14/16] locking/rwsem: Guard against making count negative
Date: Wed, 24 Apr 2019 19:01:48 +0200	[thread overview]
Message-ID: <20190424170148.GR12232@hirez.programming.kicks-ass.net> (raw)
In-Reply-To: <51589ac0-3e1f-040e-02bf-b6de77cbda1d@redhat.com>

On Wed, Apr 24, 2019 at 12:49:05PM -0400, Waiman Long wrote:
> On 4/24/19 3:09 AM, Peter Zijlstra wrote:
> > On Tue, Apr 23, 2019 at 03:12:16PM -0400, Waiman Long wrote:
> >> That is true in general, but doing preempt_disable/enable across
> >> function boundary is ugly and prone to further problems down the road.
> > We do worse things in this code, and the thing Linus proposes is
> > actually quite simple, something like so:
> >
> > ---
> > --- a/kernel/locking/rwsem.c
> > +++ b/kernel/locking/rwsem.c
> > @@ -912,7 +904,7 @@ rwsem_down_read_slowpath(struct rw_semap
> >  			raw_spin_unlock_irq(&sem->wait_lock);
> >  			break;
> >  		}
> > -		schedule();
> > +		schedule_preempt_disabled();
> >  		lockevent_inc(rwsem_sleep_reader);
> >  	}
> >  
> > @@ -1121,6 +1113,7 @@ static struct rw_semaphore *rwsem_downgr
> >   */
> >  inline void __down_read(struct rw_semaphore *sem)
> >  {
> > +	preempt_disable();
> >  	if (unlikely(atomic_long_fetch_add_acquire(RWSEM_READER_BIAS,
> >  			&sem->count) & RWSEM_READ_FAILED_MASK)) {
> >  		rwsem_down_read_slowpath(sem, TASK_UNINTERRUPTIBLE);
> > @@ -1129,10 +1122,12 @@ inline void __down_read(struct rw_semaph
> >  	} else {
> >  		rwsem_set_reader_owned(sem);
> >  	}
> > +	preempt_enable();
> >  }
> >  
> >  static inline int __down_read_killable(struct rw_semaphore *sem)
> >  {
> > +	preempt_disable();
> >  	if (unlikely(atomic_long_fetch_add_acquire(RWSEM_READER_BIAS,
> >  			&sem->count) & RWSEM_READ_FAILED_MASK)) {
> >  		if (IS_ERR(rwsem_down_read_slowpath(sem, TASK_KILLABLE)))
> > @@ -1142,6 +1137,7 @@ static inline int __down_read_killable(s
> >  	} else {
> >  		rwsem_set_reader_owned(sem);
> >  	}
> > +	preempt_enable();
> >  	return 0;
> >  }
> >  
> 
> Making that change will leave the slowpath with fewer preemption points.

That doesn't matter, right? Either it blocks or it goes through quickly.

If you're worried about a particular spot we can easily put in explicit
preemption points.

> For an uncontended rwsem, this offers no real benefit. Adding
> preempt_disable() is more complicated than I originally thought.

I'm not sure I get your objection?

> Maybe we are too paranoid about the possibility of a large number of
> preemptions happening just at the right moment. If p is the probability
> of a preemption in the middle of the inc-check-dec sequence, which I
> have already moved as close together as possible, then we are talking
> about a probability of p^32768. Since p will be really small, the
> compound probability will be infinitesimally small.

Sure; but this code runs on many millions of machines every second, so
the accumulated chance of it happening somewhere, eventually, is still
fairly significant.
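
Back of the envelope: if q is the probability of the bad case above
(~p^32768) for any one attempt, and n is the total number of attempts
across all those machines over time, the chance of at least one overflow
is

	1 - (1 - q)^n  ~=  n * q	(for small q)

which grows linearly with the install base and with uptime; "tiny per
attempt" is not the same as "tiny overall".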

> So I would like to not do preemption now for the current patchset. We
> can restart the discussion later on if there is a real concern that it
> may actually happen. Please let me know if you still want to add
> preempt_disable() for the read lock.

I like provably correct schemes over prayers.

As you noted, distros don't usually ship with PREEMPT=y and therefore
will not be bothered much by any of this.

The old scheme basically worked because the total supported reader count
was larger than the number of addressable pages in the system, so the
overflow simply could not happen.

We now transition to a bound of the number of CPUs, and for that we pay
a small price with PREEMPT=y kernels. Either that or cmpxchg.
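
For completeness, a rough sketch of the cmpxchg variant (hypothetical
helper, not from this series): re-check the count before committing the
increment, so a reader can never push it into the failed/overflow range,
at the cost of a retry loop instead of a single xadd, and with no need
for preempt_disable() at all.

	static inline bool __down_read_trylock_fast(struct rw_semaphore *sem)
	{
		long cnt = atomic_long_read(&sem->count);

		do {
			/* would land in writer/overflow territory: slowpath */
			if (cnt & RWSEM_READ_FAILED_MASK)
				return false;
		} while (!atomic_long_try_cmpxchg_acquire(&sem->count, &cnt,
							  cnt + RWSEM_READER_BIAS));

		rwsem_set_reader_owned(sem);
		return true;
	}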


Thread overview: 112+ messages
2019-04-13 17:22 [PATCH v4 00/16] locking/rwsem: Rwsem rearchitecture part 2 Waiman Long
2019-04-13 17:22 ` [PATCH v4 01/16] locking/rwsem: Prevent unneeded warning during locking selftest Waiman Long
2019-04-18  8:04   ` [tip:locking/core] " tip-bot for Waiman Long
2019-04-13 17:22 ` [PATCH v4 02/16] locking/rwsem: Make owner available even if !CONFIG_RWSEM_SPIN_ON_OWNER Waiman Long
2019-04-13 17:22 ` [PATCH v4 03/16] locking/rwsem: Remove rwsem_wake() wakeup optimization Waiman Long
2019-04-13 17:22 ` [PATCH v4 04/16] locking/rwsem: Implement a new locking scheme Waiman Long
2019-04-16 13:22   ` Peter Zijlstra
2019-04-16 13:32     ` Waiman Long
2019-04-16 14:18       ` Peter Zijlstra
2019-04-16 14:42         ` Peter Zijlstra
2019-04-13 17:22 ` [PATCH v4 05/16] locking/rwsem: Merge rwsem.h and rwsem-xadd.c into rwsem.c Waiman Long
2019-04-13 17:22 ` [PATCH v4 06/16] locking/rwsem: Code cleanup after files merging Waiman Long
2019-04-16 16:01   ` Peter Zijlstra
2019-04-16 16:17     ` Peter Zijlstra
2019-04-16 19:45       ` Waiman Long
2019-04-13 17:22 ` [PATCH v4 07/16] locking/rwsem: Implement lock handoff to prevent lock starvation Waiman Long
2019-04-16 14:12   ` Peter Zijlstra
2019-04-16 20:26     ` Waiman Long
2019-04-16 21:07       ` Waiman Long
2019-04-17  7:13         ` Peter Zijlstra
2019-04-17 16:22           ` Waiman Long
2019-04-16 15:49   ` Peter Zijlstra
2019-04-16 16:15     ` Peter Zijlstra
2019-04-16 18:41       ` Waiman Long
2019-04-16 18:16     ` Waiman Long
2019-04-16 18:32       ` Peter Zijlstra
2019-04-17  7:35       ` Peter Zijlstra
2019-04-17 16:35         ` Waiman Long
2019-04-17  8:05       ` Peter Zijlstra
2019-04-17 16:39         ` Waiman Long
2019-04-18  8:22           ` Peter Zijlstra
2019-04-17  8:17   ` Peter Zijlstra
2019-04-13 17:22 ` [PATCH v4 08/16] locking/rwsem: Make rwsem_spin_on_owner() return owner state Waiman Long
2019-04-17  9:00   ` Peter Zijlstra
2019-04-17 16:42     ` Waiman Long
2019-04-17 10:19   ` Peter Zijlstra
2019-04-17 16:53     ` Waiman Long
2019-04-17 12:41   ` Peter Zijlstra
2019-04-17 12:47     ` Peter Zijlstra
2019-04-17 18:29       ` Waiman Long
2019-04-18  8:39         ` Peter Zijlstra
2019-04-17 13:00     ` Peter Zijlstra
2019-04-17 18:50       ` Waiman Long
2019-04-13 17:22 ` [PATCH v4 09/16] locking/rwsem: Ensure an RT task will not spin on reader Waiman Long
2019-04-17 13:18   ` Peter Zijlstra
2019-04-17 18:47     ` Waiman Long
2019-04-18  8:52       ` Peter Zijlstra
2019-04-18 13:27         ` Waiman Long
2019-04-13 17:22 ` [PATCH v4 10/16] locking/rwsem: Wake up almost all readers in wait queue Waiman Long
2019-04-16 16:50   ` Davidlohr Bueso
2019-04-16 17:37     ` Waiman Long
2019-04-17 13:39   ` Peter Zijlstra
2019-04-17 17:16     ` Waiman Long
2019-04-13 17:22 ` [PATCH v4 11/16] locking/rwsem: Enable readers spinning on writer Waiman Long
2019-04-17 13:56   ` Peter Zijlstra
2019-04-17 17:34     ` Waiman Long
2019-04-18  8:57       ` Peter Zijlstra
2019-04-18 14:35         ` Waiman Long
2019-04-17 13:58   ` Peter Zijlstra
2019-04-17 17:45     ` Waiman Long
2019-04-18  9:00       ` Peter Zijlstra
2019-04-18 13:40         ` Waiman Long
2019-04-17 14:05   ` Peter Zijlstra
2019-04-17 17:51     ` Waiman Long
2019-04-18  9:11       ` Peter Zijlstra
2019-04-18 14:37         ` Waiman Long
2019-04-13 17:22 ` [PATCH v4 12/16] locking/rwsem: Enable time-based spinning on reader-owned rwsem Waiman Long
2019-04-18 13:06   ` Peter Zijlstra
2019-04-18 15:15     ` Waiman Long
2019-04-19  7:56       ` Peter Zijlstra
2019-04-19 14:33         ` Waiman Long
2019-04-19 15:36           ` Waiman Long
2019-04-13 17:22 ` [PATCH v4 13/16] locking/rwsem: Add more rwsem owner access helpers Waiman Long
2019-04-13 17:22 ` [PATCH v4 14/16] locking/rwsem: Guard against making count negative Waiman Long
2019-04-18 13:51   ` Peter Zijlstra
2019-04-18 14:08     ` Waiman Long
2019-04-18 14:30       ` Peter Zijlstra
2019-04-18 14:40       ` Peter Zijlstra
2019-04-18 14:54         ` Waiman Long
2019-04-19 10:26           ` Peter Zijlstra
2019-04-19 12:02             ` Peter Zijlstra
2019-04-19 13:03               ` Peter Zijlstra
2019-04-19 13:15                 ` Peter Zijlstra
2019-04-19 19:39                   ` Waiman Long
2019-04-21 21:07                     ` Waiman Long
2019-04-23 14:17                       ` Peter Zijlstra
2019-04-23 14:31                         ` Waiman Long
2019-04-23 16:27                         ` Linus Torvalds
2019-04-23 19:12                           ` Waiman Long
2019-04-23 19:34                             ` Peter Zijlstra
2019-04-23 19:41                               ` Waiman Long
2019-04-23 19:55                                 ` [PATCH] bpf: Fix preempt_enable_no_resched() abuse Peter Zijlstra
2019-04-23 20:03                                   ` [PATCH] trace: " Peter Zijlstra
2019-04-23 23:58                                     ` Steven Rostedt
2019-04-29  6:39                                     ` [tip:sched/core] " tip-bot for Peter Zijlstra
2019-04-29 13:31                                       ` Steven Rostedt
2019-04-29 14:08                                         ` Ingo Molnar
2019-04-23 20:27                                   ` [PATCH] bpf: " Linus Torvalds
2019-04-23 20:35                                     ` Peter Zijlstra
2019-04-23 20:45                                       ` Linus Torvalds
2019-04-24 13:19                                       ` Peter Zijlstra
2019-04-25 21:23                                   ` Alexei Starovoitov
2019-04-26  7:14                                     ` Peter Zijlstra
2019-04-24  7:09                             ` [PATCH v4 14/16] locking/rwsem: Guard against making count negative Peter Zijlstra
2019-04-24 16:49                               ` Waiman Long
2019-04-24 17:01                                 ` Peter Zijlstra [this message]
2019-04-24 17:10                                   ` Waiman Long
2019-04-24 17:56                                   ` Linus Torvalds
2019-04-13 17:22 ` [PATCH v4 15/16] locking/rwsem: Merge owner into count on x86-64 Waiman Long
2019-04-18 14:28   ` Peter Zijlstra
2019-04-18 14:40     ` Waiman Long
2019-04-13 17:22 ` [PATCH v4 16/16] locking/rwsem: Remove redundant computation of writer lock word Waiman Long
