From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
To: Joel Fernandes <joel@joelfernandes.org>,
Neeraj Upadhyay <neeraj.iitr10@gmail.com>
Cc: linux-kernel@vger.kernel.org,
Josh Triplett <josh@joshtriplett.org>,
Lai Jiangshan <jiangshanlai@gmail.com>,
"Paul E. McKenney" <paulmck@kernel.org>,
rcu@vger.kernel.org, Steven Rostedt <rostedt@goodmis.org>
Subject: Re: [RFC 0/2] srcu: Remove pre-flip memory barrier
Date: Tue, 20 Dec 2022 22:52:14 -0500 [thread overview]
Message-ID: <d010a8ca-79a4-bd25-dff1-cb7dee627365@efficios.com> (raw)
In-Reply-To: <B9B73CDE-4C2C-4BC6-A23C-A59C22AD2EB1@joelfernandes.org>
On 2022-12-20 15:55, Joel Fernandes wrote:
>
>
>> On Dec 20, 2022, at 1:29 PM, Joel Fernandes <joel@joelfernandes.org> wrote:
>>
>>
>>
>>>> On Dec 20, 2022, at 1:13 PM, Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote:
>>>>
>>>> On 2022-12-20 13:05, Joel Fernandes wrote:
>>>> Hi Mathieu,
>>>>> On Tue, Dec 20, 2022 at 5:00 PM Mathieu Desnoyers
>>>>> <mathieu.desnoyers@efficios.com> wrote:
>>>>>
>>>>> On 2022-12-19 20:04, Joel Fernandes wrote:
>>>>>>> On Mon, Dec 19, 2022 at 7:55 PM Joel Fernandes <joel@joelfernandes.org> wrote:
>>>> [...]
>>>>>>>> On a 64-bit system, where 64-bit counters are used, AFAIU this needs to
>>>>>>>> be exactly 2^64 read-side critical sections.
>>>>>>>
>>>>>>> Yes, but what about 32-bit systems?
>>>>>
>>>>> The overflow indeed happens after 2^32 increments, just like seqlock.
>>>>> The question we need to ask is therefore: if 2^32 is good enough for
>>>>> seqlock, why isn't it good enough for SRCU ?
>>>> I think Paul said wrap-around does happen with SRCU on 32-bit, but I'll
>>>> let him talk more about it. If 32-bit is good enough, should we also
>>>> shrink the counters on 64-bit, then?
>>>>>>>> There are other synchronization algorithms such as seqlocks which are
>>>>>>>> quite happy with much less protection against overflow (using a 32-bit
>>>>>>>> counter even on 64-bit architectures).
>>>>>>>
>>>>>>> The seqlock is an interesting point.
>>>>>>>
>>>>>>>> For practical purposes, I suspect this issue is really just theoretical.
>>>>>>>
>>>>>>> I have to ask, what is the benefit of avoiding a flip and scanning
>>>>>>> active readers? Is the issue about grace period delay or performance?
>>>>>>> If so, it might be worth prototyping that approach and measuring using
>>>>>>> rcutorture/rcuscale. If there is significant benefit to current
>>>>>>> approach, then IMO it is worth exploring.
>>>>>
>>>>> The main benefit I expect is improved performance of the grace period
>>>>> implementation in common cases where there are few or no readers
>>>>> present, especially on machines with many cpus.
>>>>>
>>>>> It allows scanning both periods (0/1) for each cpu within the same pass,
>>>>> therefore loading both periods' unlock counters, which sit in the same
>>>>> cache line, at once (improved locality), and then loading both periods'
>>>>> lock counters, also sitting in the same cache line.
>>>>>
>>>>> It also allows skipping the period flip entirely if there are no readers
>>>>> present, which is an -arguably- tiny performance improvement as well.
>>>> The issue of counter wrap aside, what if a new reader always shows up
>>>> in the active index being scanned? Can you not then delay the GP
>>>> indefinitely? It seems writer starvation is possible then (sure,
>>>> it is possible also with preemption after reader-index sampling, but
>>>> deliberately scanning the active index will make that worse). Seqlock does
>>>> not have such writer starvation simply because the writer does not care
>>>> about what the readers are doing.
>>>
>>> No, it's not possible for "current index" readers to starve the g.p. with the side-rcu scheme, because the initial pass (sampling both periods) only opportunistically skips flipping the period if there happen to be no readers in either period.
>>>
>>> If there are readers in the "non-current" period, the grace period waits for them.
>>>
>>> If there are readers in the "current" period, it flips the period and then waits for them.
>>
>> Ok, glad you already do that; this is what I was sort of leaning toward in my previous email as well, that is, doing a hybrid approach. Sorry, I did not know the details of your side-RCU well enough to realize you were already doing something like that.
>>
>>>
>>>> That said, the approach of scanning both counters does seem attractive
>>>> for when there are no readers, for the reasons you mentioned. Maybe a
>>>> heuristic to count the number of readers might help? If we are not
>>>> reader-heavy, then scan both. Otherwise, just scan the inactive ones,
>>>> and also couple that heuristic with the number of CPUs. I am
>>>> interested in working on such a design with you! Let us do it and
>>>> prototype/measure. ;-)
>>>
>>> Considering that it would add extra complexity, I'm unsure what that extra heuristic would improve over just scanning both periods in the first pass.
>>
>> Makes sense. I think you indirectly implement a form of heuristic already, by flipping when scanning both periods was not fruitful.
>>
>>> I'll be happy to work with you on such a design :) I think we can borrow quite a few concepts from side-rcu for this. Please be aware that my time is limited though, as I'm currently supposed to be on vacation. :)
>>
>> Oh, I was more referring to after the holidays. I am also starting vacation soon and limited in cycles ;-). It is probably better to enjoy the holidays and come back to this after.
>>
>> I do want to finish my memory-barrier studies of SRCU over the holidays, since I have been deep in the hole with that already. Back to the post-flip memory barrier here, since I think now even that might not be needed…
>
> In my view, the mb between the totaling of unlocks and totaling of locks serves as the mb that is required to enforce the GP guarantee, which I think is what Mathieu is referring to.
>
No, AFAIU you also need barriers at the beginning and end of synchronize_srcu to provide those guarantees:

 * There are memory-ordering constraints implied by synchronize_srcu().

Need for a barrier at the end of synchronize_srcu():

 * On systems with more than one CPU, when synchronize_srcu() returns,
 * each CPU is guaranteed to have executed a full memory barrier since
 * the end of its last corresponding SRCU read-side critical section
 * whose beginning preceded the call to synchronize_srcu().

Need for a barrier at the beginning of synchronize_srcu():

 * In addition,
 * each CPU having an SRCU read-side critical section that extends beyond
 * the return from synchronize_srcu() is guaranteed to have executed a
 * full memory barrier after the beginning of synchronize_srcu() and before
 * the beginning of that SRCU read-side critical section. Note that these
 * guarantees include CPUs that are offline, idle, or executing in user mode,
 * as well as CPUs that are executing in the kernel.
Thanks,
Mathieu
> Neeraj, do you agree?
>
> Thanks.
>
>
>
>
>
>>
>> Cheers,
>>
>> - Joel
>>
>>
>>>
>>> Thanks,
>>>
>>> Mathieu
>>>
>>> --
>>> Mathieu Desnoyers
>>> EfficiOS Inc.
>>> https://www.efficios.com
>>>
--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com