linux-kernel.vger.kernel.org archive mirror
From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
To: Joel Fernandes <joel@joelfernandes.org>,
	Neeraj Upadhyay <neeraj.iitr10@gmail.com>
Cc: linux-kernel@vger.kernel.org,
	Josh Triplett <josh@joshtriplett.org>,
	Lai Jiangshan <jiangshanlai@gmail.com>,
	"Paul E. McKenney" <paulmck@kernel.org>,
	rcu@vger.kernel.org, Steven Rostedt <rostedt@goodmis.org>
Subject: Re: [RFC 0/2] srcu: Remove pre-flip memory barrier
Date: Tue, 20 Dec 2022 22:52:14 -0500
Message-ID: <d010a8ca-79a4-bd25-dff1-cb7dee627365@efficios.com>
In-Reply-To: <B9B73CDE-4C2C-4BC6-A23C-A59C22AD2EB1@joelfernandes.org>

On 2022-12-20 15:55, Joel Fernandes wrote:
> 
> 
>> On Dec 20, 2022, at 1:29 PM, Joel Fernandes <joel@joelfernandes.org> wrote:
>>
>> 
>>
>>>> On Dec 20, 2022, at 1:13 PM, Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote:
>>>>
>>>> On 2022-12-20 13:05, Joel Fernandes wrote:
>>>> Hi Mathieu,
>>>>> On Tue, Dec 20, 2022 at 5:00 PM Mathieu Desnoyers
>>>>> <mathieu.desnoyers@efficios.com> wrote:
>>>>>
>>>>> On 2022-12-19 20:04, Joel Fernandes wrote:
>>>>>>> On Mon, Dec 19, 2022 at 7:55 PM Joel Fernandes <joel@joelfernandes.org> wrote:
>>>> [...]
>>>>>>>> On a 64-bit system, where 64-bit counters are used, AFAIU this needs to
>>>>>>>> be exactly 2^64 read-side critical sections.
>>>>>>>
>>>>>>> Yes, but what about 32-bit systems?
>>>>>
>>>>> The overflow indeed happens after 2^32 increments, just like seqlock.
>>>>> The question we need to ask is therefore: if 2^32 is good enough for
>>>>> seqlock, why isn't it good enough for SRCU ?
>>>> I think Paul said wrap around does happen with SRCU on 32-bit but I'll
>>>> let him talk more about it. If 32-bit is good enough, let us also drop
>>>> the size of the counters for 64-bit then?
>>>>>>>> There are other synchronization algorithms such as seqlocks which are
>>>>>>>> quite happy with much less protection against overflow (using a 32-bit
>>>>>>>> counter even on 64-bit architectures).
>>>>>>>
>>>>>>> The seqlock is an interesting point.
>>>>>>>
>>>>>>>> For practical purposes, I suspect this issue is really just theoretical.
>>>>>>>
>>>>>>> I have to ask, what is the benefit of avoiding a flip and scanning
>>>>>>> active readers? Is the issue about grace period delay or performance?
>>>>>>> If so, it might be worth prototyping that approach and measuring using
>>>>>>> rcutorture/rcuscale. If there is significant benefit to current
>>>>>>> approach, then IMO it is worth exploring.
>>>>>
>>>>> The main benefit I expect is improved performance of the grace period
>>>>> implementation in common cases where there are few or no readers
>>>>> present, especially on machines with many cpus.
>>>>>
>>>>> It allows scanning both periods (0/1) for each cpu within the same pass,
>>>>> therefore loading both periods' unlock counters sitting in the same
>>>>> cache line at once (improved locality), and then loading both periods'
>>>>> lock counters, also sitting in the same cache line.
>>>>>
>>>>> It also allows skipping the period flip entirely if there are no readers
>>>>> present, which is an -arguably- tiny performance improvement as well.
>>>> The issue of counter wrap aside, what if a new reader always shows up
>>>> in the active index being scanned, then can you not delay the GP
>>>> indefinitely? It seems like writer-starvation is possible then (sure
>>>> it is possible also with preemption after reader-index-sampling, but
>>>> scanning active index deliberately will make that worse). Seqlock does
>>>> not have such writer starvation just because the writer does not care
>>>> about what the readers are doing.
>>>
>>> No, it's not possible for "current index" readers to starve the g.p. with the side-rcu scheme, because the initial pass (sampling both periods) only opportunistically skips flipping the period if there happen to be no readers in either period.
>>>
>>> If there are readers in the "non-current" period, the grace period waits for them.
>>>
>>> If there are readers in the "current" period, it flips the period and then waits for them.
>>
>> Ok glad you already do that, this is what I was sort of leaning at in my previous email as well, that is doing a hybrid approach. Sorry I did not know the details of your side-RCU to know you were already doing something like that.
>>
>>>
>>>> That said, the approach of scanning both counters does seem attractive
>>>> for when there are no readers, for the reasons you mentioned. Maybe a
>>>> heuristic to count the number of readers might help? If we are not
>>>> reader-heavy, then scan both. Otherwise, just scan the inactive ones,
>>>> and also couple that heuristic with the number of CPUs. I am
>>>> interested in working on such a design with you! Let us do it and
>>>> prototype/measure. ;-)
>>>
>>> Considering that it would add extra complexity, I'm unsure what that extra heuristic would improve over just scanning both periods in the first pass.
>>
>> Makes sense, I think you indirectly implement a form of heuristic already by flipping in case scanning both was not fruitful.
>>
>>> I'll be happy to work with you on such a design :) I think we can borrow quite a few concepts from side-rcu for this. Please be aware that my time is limited though, as I'm currently supposed to be on vacation. :)
>>
>> Oh, I was more referring to after the holidays. I am also starting vacation soon and limited in cycles ;-). It is probably better to enjoy the holidays and come back to this after.
>>
>> I do want to finish my memory barrier studies of SRCU over the holidays since I have been deep in the hole with that already. Back to the post flip memory barrier here since I think now even that might not be needed…
> 
> In my view, the mb between the totaling of unlocks and the totaling of locks serves as the mb that is required to enforce the GP guarantee, which I think is what Mathieu is referring to.
> 

No, AFAIU you also need barriers at the beginning and at the end of synchronize_srcu() to provide those guarantees:

  * There are memory-ordering constraints implied by synchronize_srcu().

Need for a barrier at the end of synchronize_srcu():

  * On systems with more than one CPU, when synchronize_srcu() returns,
  * each CPU is guaranteed to have executed a full memory barrier since
  * the end of its last corresponding SRCU read-side critical section
  * whose beginning preceded the call to synchronize_srcu().

Need for a barrier at the beginning of synchronize_srcu():

  * In addition,
  * each CPU having an SRCU read-side critical section that extends beyond
  * the return from synchronize_srcu() is guaranteed to have executed a
  * full memory barrier after the beginning of synchronize_srcu() and before
  * the beginning of that SRCU read-side critical section.  Note that these
  * guarantees include CPUs that are offline, idle, or executing in user mode,
  * as well as CPUs that are executing in the kernel.
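
To make the barrier placement concrete, here is a minimal userspace sketch using C11 atomics. It is neither the kernel SRCU code nor the side-rcu code: every toy_* name, the TOY_NR_CPUS constant and the counter layout are hypothetical, a single updater is assumed (no mutex), and the ordering has not been validated against any memory model. It only shows where the fences discussed above would sit: one at the beginning and one at the end of the grace period, plus the one between totaling unlocks and totaling locks.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define TOY_NR_CPUS 4	/* hypothetical; a real implementation uses per-CPU data */

/* Hypothetical per-CPU counters: one lock/unlock pair per period (0/1). */
struct toy_cpu_count {
	atomic_ulong lock[2];
	atomic_ulong unlock[2];
};

struct toy_srcu {
	atomic_uint period;			/* current period, 0 or 1 */
	struct toy_cpu_count cpu[TOY_NR_CPUS];
};

/* Reader entry: count in the current period, then a full fence. */
static int toy_read_lock(struct toy_srcu *s, int cpu)
{
	int idx = (int)(atomic_load_explicit(&s->period, memory_order_relaxed) & 1u);

	atomic_fetch_add_explicit(&s->cpu[cpu].lock[idx], 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	return idx;
}

/* Reader exit: full fence, then count the unlock (possibly on another CPU). */
static void toy_read_unlock(struct toy_srcu *s, int cpu, int idx)
{
	assert(idx == 0 || idx == 1);
	atomic_thread_fence(memory_order_seq_cst);
	atomic_fetch_add_explicit(&s->cpu[cpu].unlock[idx], 1, memory_order_relaxed);
}

/* Total unlocks, full fence, then total locks; idle if the sums match. */
static bool toy_period_idle(struct toy_srcu *s, int idx)
{
	unsigned long unlocks = 0, locks = 0;
	int cpu;

	for (cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		unlocks += atomic_load_explicit(&s->cpu[cpu].unlock[idx],
						memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	for (cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		locks += atomic_load_explicit(&s->cpu[cpu].lock[idx],
					      memory_order_relaxed);
	return locks == unlocks;
}

/* Returns true if the period was flipped. Busy-waits while readers are active. */
static bool toy_synchronize(struct toy_srcu *s)
{
	bool flipped = false;
	unsigned int cur;

	atomic_thread_fence(memory_order_seq_cst);	/* barrier at the beginning */
	cur = atomic_load_explicit(&s->period, memory_order_relaxed) & 1u;
	/* Opportunistic first pass: skip the flip if both periods are idle. */
	if (!(toy_period_idle(s, 0) && toy_period_idle(s, 1))) {
		while (!toy_period_idle(s, (int)(cur ^ 1u)))
			;	/* wait for readers of the non-current period */
		atomic_store_explicit(&s->period, cur ^ 1u, memory_order_relaxed);
		atomic_thread_fence(memory_order_seq_cst);
		while (!toy_period_idle(s, (int)cur))
			;	/* wait for readers of the previously current period */
		flipped = true;
	}
	atomic_thread_fence(memory_order_seq_cst);	/* barrier at the end */
	return flipped;
}
```

In this sketch the fence inside toy_period_idle() orders the two sums against each other, while the fences at the start and end of toy_synchronize() are the ones that correspond to the guarantees quoted above.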

Thanks,

Mathieu

> Neeraj, do you agree?
> 
> Thanks.
> 
> 
> 
> 
> 
>>
>> Cheers,
>>
>> - Joel
>>
>>

-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com

