linux-next.vger.kernel.org archive mirror
* linux-next: build warnings after merge of the rcu tree
@ 2022-11-23  5:32 Stephen Rothwell
  2022-11-23  8:19 ` Leizhen (ThunderTown)
  0 siblings, 1 reply; 8+ messages in thread
From: Stephen Rothwell @ 2022-11-23  5:32 UTC
  To: Paul E. McKenney
  Cc: Zhen Lei, Linux Kernel Mailing List, Linux Next Mailing List

Hi all,

After merging the rcu tree, today's linux-next build (htmldocs) produced
these warnings:

Documentation/RCU/stallwarn.rst:401: WARNING: Literal block expected; none found.
Documentation/RCU/stallwarn.rst:428: WARNING: Literal block expected; none found.
Documentation/RCU/stallwarn.rst:445: WARNING: Literal block expected; none found.
Documentation/RCU/stallwarn.rst:459: WARNING: Literal block expected; none found.
Documentation/RCU/stallwarn.rst:468: WARNING: Literal block expected; none found.

Introduced by commit

  3d2788ba4573 ("doc: Document CONFIG_RCU_CPU_STALL_CPUTIME=y stall information")

-- 
Cheers,
Stephen Rothwell


* Re: linux-next: build warnings after merge of the rcu tree
  2022-11-23  5:32 linux-next: build warnings after merge of the rcu tree Stephen Rothwell
@ 2022-11-23  8:19 ` Leizhen (ThunderTown)
  2022-11-23 14:27   ` Paul E. McKenney
  0 siblings, 1 reply; 8+ messages in thread
From: Leizhen (ThunderTown) @ 2022-11-23  8:19 UTC
  To: Stephen Rothwell, Paul E. McKenney
  Cc: Linux Kernel Mailing List, Linux Next Mailing List



On 2022/11/23 13:32, Stephen Rothwell wrote:
> Hi all,
> 
> After merging the rcu tree, today's linux-next build (htmldocs) produced
> these warnings:
> 
> Documentation/RCU/stallwarn.rst:401: WARNING: Literal block expected; none found.
> Documentation/RCU/stallwarn.rst:428: WARNING: Literal block expected; none found.
> Documentation/RCU/stallwarn.rst:445: WARNING: Literal block expected; none found.
> Documentation/RCU/stallwarn.rst:459: WARNING: Literal block expected; none found.
> Documentation/RCU/stallwarn.rst:468: WARNING: Literal block expected; none found.
> 
> Introduced by commit
> 
>   3d2788ba4573 ("doc: Document CONFIG_RCU_CPU_STALL_CPUTIME=y stall information")
> 

Strange: I specifically ran make htmldocs beforehand and, unexpectedly, did not
see these warnings.

I now know why: the literal blocks are not indented. I will post a new version to
Paul E. McKenney. Sorry for the trouble this has caused everyone.

For example:
@@ -398,9 +398,9 @@ In kernels built with CONFIG_RCU_CPU_STALL_CPUTIME=y or booted with
 rcupdate.rcu_cpu_stall_cputime=1, the following additional information
 is supplied with each RCU CPU stall warning::

-rcu:          hardirqs   softirqs   csw/system
-rcu:  number:      624         45            0
-rcu: cputime:       69          1         2425   ==> 2500(ms)
+  rcu:          hardirqs   softirqs   csw/system
+  rcu:  number:      624         45            0
+  rcu: cputime:       69          1         2425   ==> 2500(ms)
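
As a rough illustration of the rule behind the fix above: in reStructuredText,
a paragraph ending in "::" must be followed by an indented block, otherwise
docutils reports "Literal block expected; none found." The sketch below is a
simplified standalone check, not part of Sphinx or the kernel tree. The function
name find_unindented_literal_blocks is made up for this example, and it ignores
quoted literal blocks and other corner cases.

```python
def find_unindented_literal_blocks(text):
    """Return 1-based line numbers of '::' lines not followed by an indented block.

    Simplified rule (an assumption for this sketch): a line ending in '::'
    must be followed, after optional blank lines, by a line indented more
    deeply than the '::' line itself.
    """
    lines = text.splitlines()
    problems = []
    for i, line in enumerate(lines):
        if not line.rstrip().endswith("::"):
            continue
        base_indent = len(line) - len(line.lstrip())
        # Skip the blank line(s) separating the paragraph from the block.
        j = i + 1
        while j < len(lines) and not lines[j].strip():
            j += 1
        if j == len(lines):
            problems.append(i + 1)  # nothing at all follows the '::'
            continue
        next_indent = len(lines[j]) - len(lines[j].lstrip())
        if next_indent <= base_indent:
            problems.append(i + 1)
    return problems


# The two variants from the diff above, before and after the fix.
BROKEN = """is supplied with each RCU CPU stall warning::

rcu:          hardirqs   softirqs   csw/system
rcu:  number:      624         45            0
"""

FIXED = """is supplied with each RCU CPU stall warning::

  rcu:          hardirqs   softirqs   csw/system
  rcu:  number:      624         45            0
"""

if __name__ == "__main__":
    print(find_unindented_literal_blocks(BROKEN))
    print(find_unindented_literal_blocks(FIXED))
```

Note that this sketch reports the line carrying the "::", whereas the warnings
quoted in this thread point at the line where the block was expected.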


-- 
Regards,
  Zhen Lei

* Re: linux-next: build warnings after merge of the rcu tree
  2022-11-23  8:19 ` Leizhen (ThunderTown)
@ 2022-11-23 14:27   ` Paul E. McKenney
  0 siblings, 0 replies; 8+ messages in thread
From: Paul E. McKenney @ 2022-11-23 14:27 UTC
  To: Leizhen (ThunderTown)
  Cc: Stephen Rothwell, Linux Kernel Mailing List, Linux Next Mailing List

On Wed, Nov 23, 2022 at 04:19:15PM +0800, Leizhen (ThunderTown) wrote:
> 
> 
> On 2022/11/23 13:32, Stephen Rothwell wrote:
> > Hi all,
> > 
> > After merging the rcu tree, today's linux-next build (htmldocs) produced
> > these warnings:
> > 
> > Documentation/RCU/stallwarn.rst:401: WARNING: Literal block expected; none found.
> > Documentation/RCU/stallwarn.rst:428: WARNING: Literal block expected; none found.
> > Documentation/RCU/stallwarn.rst:445: WARNING: Literal block expected; none found.
> > Documentation/RCU/stallwarn.rst:459: WARNING: Literal block expected; none found.
> > Documentation/RCU/stallwarn.rst:468: WARNING: Literal block expected; none found.
> > 
> > Introduced by commit
> > 
> >   3d2788ba4573 ("doc: Document CONFIG_RCU_CPU_STALL_CPUTIME=y stall information")
> > 
> 
> Strange: I specifically ran make htmldocs beforehand and, unexpectedly, did not
> see these warnings.
> 
> I now know why: the literal blocks are not indented. I will post a new version to
> Paul E. McKenney. Sorry for the trouble this has caused everyone.
> 
> For example:
> @@ -398,9 +398,9 @@ In kernels built with CONFIG_RCU_CPU_STALL_CPUTIME=y or booted with
>  rcupdate.rcu_cpu_stall_cputime=1, the following additional information
>  is supplied with each RCU CPU stall warning::
> 
> -rcu:          hardirqs   softirqs   csw/system
> -rcu:  number:      624         45            0
> -rcu: cputime:       69          1         2425   ==> 2500(ms)
> +  rcu:          hardirqs   softirqs   csw/system
> +  rcu:  number:      624         45            0
> +  rcu: cputime:       69          1         2425   ==> 2500(ms)

This is probably my fault, introduced during my wordsmithing, but I will
happily accept a new patch to replace the current one.

							Thanx, Paul

* Re: linux-next: build warnings after merge of the rcu tree
  2020-11-26 15:12 ` Paul E. McKenney
@ 2020-11-26 17:28   ` Uladzislau Rezki
  0 siblings, 0 replies; 8+ messages in thread
From: Uladzislau Rezki @ 2020-11-26 17:28 UTC
  To: Paul E. McKenney, Stephen Rothwell
  Cc: Stephen Rothwell, Uladzislau Rezki (Sony),
	Linux Kernel Mailing List, Linux Next Mailing List

> On Thu, Nov 26, 2020 at 05:44:28PM +1100, Stephen Rothwell wrote:
> > Hi all,
> > 
> > After merging the rcu tree, today's linux-next build (htmldocs) produced
> > these warnings:
> > 
> > include/linux/rcupdate.h:872: warning: Excess function parameter 'ptr' description in 'kfree_rcu'
> > include/linux/rcupdate.h:872: warning: Excess function parameter 'rhf' description in 'kfree_rcu'
> > 
> > Introduced by commit
> > 
> >   beba8bdf2f16 ("rcu: Introduce kfree_rcu() single-argument macro")
> 
> Heh!  The documentation isn't dealing at all well with this situation.
> 
> Would one of the docbook experts have some advice, keeping in mind
> that kfree_rcu might have either one or two arguments?
> 
Indeed. The question is whether docbook is capable of describing a macro
that takes either one or two arguments.

--
Vlad Rezki

* Re: linux-next: build warnings after merge of the rcu tree
  2020-11-26  6:44 Stephen Rothwell
@ 2020-11-26 15:12 ` Paul E. McKenney
  2020-11-26 17:28   ` Uladzislau Rezki
  0 siblings, 1 reply; 8+ messages in thread
From: Paul E. McKenney @ 2020-11-26 15:12 UTC
  To: Stephen Rothwell
  Cc: Uladzislau Rezki (Sony),
	Linux Kernel Mailing List, Linux Next Mailing List

On Thu, Nov 26, 2020 at 05:44:28PM +1100, Stephen Rothwell wrote:
> Hi all,
> 
> After merging the rcu tree, today's linux-next build (htmldocs) produced
> these warnings:
> 
> include/linux/rcupdate.h:872: warning: Excess function parameter 'ptr' description in 'kfree_rcu'
> include/linux/rcupdate.h:872: warning: Excess function parameter 'rhf' description in 'kfree_rcu'
> 
> Introduced by commit
> 
>   beba8bdf2f16 ("rcu: Introduce kfree_rcu() single-argument macro")

Heh!  The documentation isn't dealing at all well with this situation.

Would one of the docbook experts have some advice, keeping in mind
that kfree_rcu might have either one or two arguments?

							Thanx, Paul

------------------------------------------------------------------------

/**
 * kfree_rcu() - kfree an object after a grace period.
 * @ptr: pointer to kfree for both single- and double-argument invocations.
 * @rhf: the name of the struct rcu_head within the type of @ptr,
 *       but only for double-argument invocations.
 *
 * Many RCU callback functions just call kfree() on the base structure.
 * These functions are trivial, but their size adds up, and furthermore
 * when they are used in a kernel module, that module must invoke the
 * high-latency rcu_barrier() function at module-unload time.
 *
 * The kfree_rcu() function handles this issue.  Rather than encoding a
 * function address in the embedded rcu_head structure, kfree_rcu() instead
 * encodes the offset of the rcu_head structure within the base structure.
 * Because the functions are not allowed in the low-order 4096 bytes of
 * kernel virtual memory, offsets up to 4095 bytes can be accommodated.
 * If the offset is larger than 4095 bytes, a compile-time error will
 * be generated in kvfree_rcu_arg_2(). If this error is triggered, you can
 * either fall back to use of call_rcu() or rearrange the structure to
 * position the rcu_head structure into the first 4096 bytes.
 *
 * Note that the allowable offset might decrease in the future, for example,
 * to allow something like kmem_cache_free_rcu().
 *
 * The BUILD_BUG_ON check must not involve any function calls, hence the
 * checks are done in macros here.
 */
#define kfree_rcu kvfree_rcu
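
For readers wondering why the comment above triggers the warning: kfree_rcu is
defined here as an object-like macro with no parameter list, so a checker that
compares the "@name:" descriptions against the declared parameters finds
nothing for @ptr and @rhf to match. The following is a minimal sketch of such a
comparison, not the actual scripts/kernel-doc logic. The function name, the
regular expressions, and the choice to accept any name for a variadic "(...)"
macro are all assumptions made for illustration.

```python
import re


def excess_param_descriptions(comment, define_line):
    """Return documented @params that are absent from the macro's parameter list."""
    # Parameters documented in the comment: lines of the form ' * @name: ...'.
    documented = re.findall(r"^\s*\*\s*@(\w+):", comment, flags=re.MULTILINE)

    # Parameters declared by the macro; an object-like macro has no '(...)' part.
    m = re.match(r"\s*#\s*define\s+\w+\(([^)]*)\)", define_line)
    declared = []
    if m:
        declared = [p.strip() for p in m.group(1).split(",") if p.strip()]

    # Assumption of this sketch: a purely variadic macro matches every name.
    if "..." in declared:
        return []
    return [name for name in documented if name not in declared]


COMMENT = """/**
 * kfree_rcu() - kfree an object after a grace period.
 * @ptr: pointer to kfree for both single- and double-argument invocations.
 * @rhf: the name of the struct rcu_head within the type of @ptr,
 *       but only for double-argument invocations.
 */"""

if __name__ == "__main__":
    for name in excess_param_descriptions(COMMENT, "#define kfree_rcu kvfree_rcu"):
        print(f"warning: Excess function parameter '{name}' description in 'kfree_rcu'")
```

With a function-like definition that names both parameters, the same check
returns an empty list, which is one way to see why the shape of the #define
matters to the documentation tooling.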

* linux-next: build warnings after merge of the rcu tree
@ 2020-11-26  6:44 Stephen Rothwell
  2020-11-26 15:12 ` Paul E. McKenney
  0 siblings, 1 reply; 8+ messages in thread
From: Stephen Rothwell @ 2020-11-26  6:44 UTC
  To: Paul E. McKenney
  Cc: Uladzislau Rezki (Sony),
	Linux Kernel Mailing List, Linux Next Mailing List

Hi all,

After merging the rcu tree, today's linux-next build (htmldocs) produced
these warnings:

include/linux/rcupdate.h:872: warning: Excess function parameter 'ptr' description in 'kfree_rcu'
include/linux/rcupdate.h:872: warning: Excess function parameter 'rhf' description in 'kfree_rcu'

Introduced by commit

  beba8bdf2f16 ("rcu: Introduce kfree_rcu() single-argument macro")

-- 
Cheers,
Stephen Rothwell


* Re: linux-next: build warnings after merge of the rcu tree
  2020-11-10  6:59 Stephen Rothwell
@ 2020-11-10 15:21 ` Paul E. McKenney
  0 siblings, 0 replies; 8+ messages in thread
From: Paul E. McKenney @ 2020-11-10 15:21 UTC
  To: Stephen Rothwell
  Cc: Linux Kernel Mailing List, Linux Next Mailing List,
	Peter Zijlstra, Mauro Carvalho Chehab, Jonathan Corbet

On Tue, Nov 10, 2020 at 05:59:52PM +1100, Stephen Rothwell wrote:
> Hi all,
> 
> After merging the rcu tree, today's linux-next build (htmldocs)
> produced these warnings:
> 
> Documentation/RCU/Design/Requirements/Requirements.rst:119: WARNING: Malformed table.

My bad, apologies, queuing an alleged fix.

							Thanx, Paul
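
A note on the warnings quoted below: several of the table rows are narrower
than their "+---+" border lines, and a grid table whose right-hand edge does
not line up is what docutils reports as "Malformed table." Here is a minimal
sketch of a width check, assuming simple single-column grid tables like the
ones in this report. The function name is hypothetical, and real grid-table
parsing in docutils is considerably more involved.

```python
def misaligned_rows(table_text):
    """Return (line_number, width) for lines whose width differs from the top border."""
    lines = [ln.rstrip() for ln in table_text.splitlines() if ln.strip()]
    if not lines:
        return []
    expected = len(lines[0])  # width of the first '+---+' border line
    return [(n, len(ln)) for n, ln in enumerate(lines, start=1) if len(ln) != expected]


# Build a tiny table programmatically so that the widths are unambiguous.
BORDER = "+" + "-" * 30 + "+"                              # 32 characters
GOOD_ROW = "| " + "**Quick Quiz**:".ljust(28) + " |"       # 32 characters
SHORT_ROW = "| " + "synchronize_rcu()".ljust(24) + " |"    # 28 characters
TABLE = "\n".join([BORDER, GOOD_ROW, BORDER, SHORT_ROW, BORDER])

if __name__ == "__main__":
    for line_number, width in misaligned_rows(TABLE):
        print(f"line {line_number}: width {width}, expected {len(BORDER)}")
```

In the quoted output, the short rows appear to be the ones that mention
function names such as synchronize_rcu(), so re-padding those rows to the
border width is the likely shape of the fix, though the thread itself does not
spell that out.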

> +-----------------------------------------------------------------------+
> | **Quick Quiz**:                                                       |
> +-----------------------------------------------------------------------+
> | Wait a minute! You said that updaters can make useful forward         |
> | progress concurrently with readers, but pre-existing readers will     |
> | block synchronize_rcu()!!!                                        |
> | Just who are you trying to fool???                                    |
> +-----------------------------------------------------------------------+
> | **Answer**:                                                           |
> +-----------------------------------------------------------------------+
> | First, if updaters do not wish to be blocked by readers, they can use |
> | call_rcu() or kfree_rcu(), which will be discussed later.     |
> | Second, even when using synchronize_rcu(), the other update-side  |
> | code does run concurrently with readers, whether pre-existing or not. |
> +-----------------------------------------------------------------------+
> Documentation/RCU/Design/Requirements/Requirements.rst:178: WARNING: Malformed table.
> 
> +-----------------------------------------------------------------------+
> | **Quick Quiz**:                                                       |
> +-----------------------------------------------------------------------+
> | Why is the synchronize_rcu() on line 28 needed?                   |
> +-----------------------------------------------------------------------+
> | **Answer**:                                                           |
> +-----------------------------------------------------------------------+
> | Without that extra grace period, memory reordering could result in    |
> | do_something_dlm() executing do_something() concurrently with |
> | the last bits of recovery().                                      |
> +-----------------------------------------------------------------------+
> Documentation/RCU/Design/Requirements/Requirements.rst:289: WARNING: Malformed table.
> 
> +-----------------------------------------------------------------------+
> | **Quick Quiz**:                                                       |
> +-----------------------------------------------------------------------+
> | But rcu_assign_pointer() does nothing to prevent the two          |
> | assignments to ``p->a`` and ``p->b`` from being reordered. Can't that |
> | also cause problems?                                                  |
> +-----------------------------------------------------------------------+
> | **Answer**:                                                           |
> +-----------------------------------------------------------------------+
> | No, it cannot. The readers cannot see either of these two fields      |
> | until the assignment to ``gp``, by which time both fields are fully   |
> | initialized. So reordering the assignments to ``p->a`` and ``p->b``   |
> | cannot possibly cause any problems.                                   |
> +-----------------------------------------------------------------------+
> Documentation/RCU/Design/Requirements/Requirements.rst:430: WARNING: Malformed table.
> 
> +-----------------------------------------------------------------------+
> | **Quick Quiz**:                                                       |
> +-----------------------------------------------------------------------+
> | Without the rcu_dereference() or the rcu_access_pointer(),    |
> | what destructive optimizations might the compiler make use of?        |
> +-----------------------------------------------------------------------+
> | **Answer**:                                                           |
> +-----------------------------------------------------------------------+
> | Let's start with what happens to do_something_gp() if it fails to |
> | use rcu_dereference(). It could reuse a value formerly fetched    |
> | from this same pointer. It could also fetch the pointer from ``gp``   |
> | in a byte-at-a-time manner, resulting in *load tearing*, in turn      |
> | resulting a bytewise mash-up of two distinct pointer values. It might |
> | even use value-speculation optimizations, where it makes a wrong      |
> | guess, but by the time it gets around to checking the value, an       |
> | update has changed the pointer to match the wrong guess. Too bad      |
> | about any dereferences that returned pre-initialization garbage in    |
> | the meantime!                                                         |
> | For remove_gp_synchronous(), as long as all modifications to      |
> | ``gp`` are carried out while holding ``gp_lock``, the above           |
> | optimizations are harmless. However, ``sparse`` will complain if you  |
> | define ``gp`` with ``__rcu`` and then access it without using either  |
> | rcu_access_pointer() or rcu_dereference().                    |
> +-----------------------------------------------------------------------+
> Documentation/RCU/Design/Requirements/Requirements.rst:513: WARNING: Malformed table.
> 
> +-----------------------------------------------------------------------+
> | **Quick Quiz**:                                                       |
> +-----------------------------------------------------------------------+
> | Given that multiple CPUs can start RCU read-side critical sections at |
> | any time without any ordering whatsoever, how can RCU possibly tell   |
> | whether or not a given RCU read-side critical section starts before a |
> | given instance of synchronize_rcu()?                              |
> +-----------------------------------------------------------------------+
> | **Answer**:                                                           |
> +-----------------------------------------------------------------------+
> | If RCU cannot tell whether or not a given RCU read-side critical      |
> | section starts before a given instance of synchronize_rcu(), then |
> | it must assume that the RCU read-side critical section started first. |
> | In other words, a given instance of synchronize_rcu() can avoid   |
> | waiting on a given RCU read-side critical section only if it can      |
> | prove that synchronize_rcu() started first.                       |
> | A related question is “When rcu_read_lock() doesn't generate any  |
> | code, why does it matter how it relates to a grace period?” The       |
> | answer is that it is not the relationship of rcu_read_lock()      |
> | itself that is important, but rather the relationship of the code     |
> | within the enclosed RCU read-side critical section to the code        |
> | preceding and following the grace period. If we take this viewpoint,  |
> | then a given RCU read-side critical section begins before a given     |
> | grace period when some access preceding the grace period observes the |
> | effect of some access within the critical section, in which case none |
> | of the accesses within the critical section may observe the effects   |
> | of any access following the grace period.                             |
> |                                                                       |
> | As of late 2016, mathematical models of RCU take this viewpoint, for  |
> | example, see slides 62 and 63 of the `2016 LinuxCon                   |
> | EU <http://www2.rdrop.com/users/paulmck/scalability/paper/LinuxMM.201 |
> | 6.10.04c.LCE.pdf>`__                                                  |
> | presentation.                                                         |
> +-----------------------------------------------------------------------+
> Documentation/RCU/Design/Requirements/Requirements.rst:548: WARNING: Malformed table.
> 
> +-----------------------------------------------------------------------+
> | **Quick Quiz**:                                                       |
> +-----------------------------------------------------------------------+
> | The first and second guarantees require unbelievably strict ordering! |
> | Are all these memory barriers *really* required?                      |
> +-----------------------------------------------------------------------+
> | **Answer**:                                                           |
> +-----------------------------------------------------------------------+
> | Yes, they really are required. To see why the first guarantee is      |
> | required, consider the following sequence of events:                  |
> |                                                                       |
> | #. CPU 1: rcu_read_lock()                                         |
> | #. CPU 1: ``q = rcu_dereference(gp); /* Very likely to return p. */`` |
> | #. CPU 0: ``list_del_rcu(p);``                                        |
> | #. CPU 0: synchronize_rcu() starts.                               |
> | #. CPU 1: ``do_something_with(q->a);``                                |
> |    ``/* No smp_mb(), so might happen after kfree(). */``              |
> | #. CPU 1: rcu_read_unlock()                                       |
> | #. CPU 0: synchronize_rcu() returns.                              |
> | #. CPU 0: ``kfree(p);``                                               |
> |                                                                       |
> | Therefore, there absolutely must be a full memory barrier between the |
> | end of the RCU read-side critical section and the end of the grace    |
> | period.                                                               |
> |                                                                       |
> | The sequence of events demonstrating the necessity of the second rule |
> | is roughly similar:                                                   |
> |                                                                       |
> | #. CPU 0: ``list_del_rcu(p);``                                        |
> | #. CPU 0: synchronize_rcu() starts.                               |
> | #. CPU 1: rcu_read_lock()                                         |
> | #. CPU 1: ``q = rcu_dereference(gp);``                                |
> |    ``/* Might return p if no memory barrier. */``                     |
> | #. CPU 0: synchronize_rcu() returns.                              |
> | #. CPU 0: ``kfree(p);``                                               |
> | #. CPU 1: ``do_something_with(q->a); /* Boom!!! */``                  |
> | #. CPU 1: rcu_read_unlock()                                       |
> |                                                                       |
> | And similarly, without a memory barrier between the beginning of the  |
> | grace period and the beginning of the RCU read-side critical section, |
> | CPU 1 might end up accessing the freelist.                            |
> |                                                                       |
> | The “as if” rule of course applies, so that any implementation that   |
> | acts as if the appropriate memory barriers were in place is a correct |
> | implementation. That said, it is much easier to fool yourself into    |
> | believing that you have adhered to the as-if rule than it is to       |
> | actually adhere to it!                                                |
> +-----------------------------------------------------------------------+
> Documentation/RCU/Design/Requirements/Requirements.rst:597: WARNING: Malformed table.
> 
> +-----------------------------------------------------------------------+
> | **Quick Quiz**:                                                       |
> +-----------------------------------------------------------------------+
> | You claim that rcu_read_lock() and rcu_read_unlock() generate |
> | absolutely no code in some kernel builds. This means that the         |
> | compiler might arbitrarily rearrange consecutive RCU read-side        |
> | critical sections. Given such rearrangement, if a given RCU read-side |
> | critical section is done, how can you be sure that all prior RCU      |
> | read-side critical sections are done? Won't the compiler              |
> | rearrangements make that impossible to determine?                     |
> +-----------------------------------------------------------------------+
> | **Answer**:                                                           |
> +-----------------------------------------------------------------------+
> | In cases where rcu_read_lock() and rcu_read_unlock() generate |
> | absolutely no code, RCU infers quiescent states only at special       |
> | locations, for example, within the scheduler. Because calls to        |
> | schedule() had better prevent calling-code accesses to shared     |
> | variables from being rearranged across the call to schedule(), if |
> | RCU detects the end of a given RCU read-side critical section, it     |
> | will necessarily detect the end of all prior RCU read-side critical   |
> | sections, no matter how aggressively the compiler scrambles the code. |
> | Again, this all assumes that the compiler cannot scramble code across |
> | calls to the scheduler, out of interrupt handlers, into the idle      |
> | loop, into user-mode code, and so on. But if your kernel build allows |
> | that sort of scrambling, you have broken far more than just RCU!      |
> +-----------------------------------------------------------------------+
> Documentation/RCU/Design/Requirements/Requirements.rst:738: WARNING: Malformed table.
> 
> +-----------------------------------------------------------------------+
> | **Quick Quiz**:                                                       |
> +-----------------------------------------------------------------------+
> | Can't the compiler also reorder this code?                            |
> +-----------------------------------------------------------------------+
> | **Answer**:                                                           |
> +-----------------------------------------------------------------------+
> | No, the volatile casts in READ_ONCE() and WRITE_ONCE()        |
> | prevent the compiler from reordering in this particular case.         |
> +-----------------------------------------------------------------------+
> Documentation/RCU/Design/Requirements/Requirements.rst:793: WARNING: Malformed table.
> 
> +-----------------------------------------------------------------------+
> | **Quick Quiz**:                                                       |
> +-----------------------------------------------------------------------+
> | Suppose that synchronize_rcu() did wait until *all* readers had       |
> | completed instead of waiting only on pre-existing readers. For how    |
> | long would the updater be able to rely on there being no readers?     |
> +-----------------------------------------------------------------------+
> | **Answer**:                                                           |
> +-----------------------------------------------------------------------+
> | For no time at all. Even if synchronize_rcu() were to wait until  |
> | all readers had completed, a new reader might start immediately after |
> | synchronize_rcu() completed. Therefore, the code following        |
> | synchronize_rcu() can *never* rely on there being no readers.     |
> +-----------------------------------------------------------------------+
> Documentation/RCU/Design/Requirements/Requirements.rst:1087: WARNING: Malformed table.
> 
> +-----------------------------------------------------------------------+
> | **Quick Quiz**:                                                       |
> +-----------------------------------------------------------------------+
> | What about sleeping locks?                                            |
> +-----------------------------------------------------------------------+
> | **Answer**:                                                           |
> +-----------------------------------------------------------------------+
> | These are forbidden within Linux-kernel RCU read-side critical        |
> | sections because it is not legal to place a quiescent state (in this  |
> | case, voluntary context switch) within an RCU read-side critical      |
> | section. However, sleeping locks may be used within userspace RCU     |
> | read-side critical sections, and also within Linux-kernel sleepable   |
> | RCU `(SRCU) <#Sleepable%20RCU>`__ read-side critical sections. In     |
> | addition, the -rt patchset turns spinlocks into a sleeping locks so   |
> | that the corresponding critical sections can be preempted, which also |
> | means that these sleeplockified spinlocks (but not other sleeping     |
> | locks!) may be acquire within -rt-Linux-kernel RCU read-side critical |
> | sections.                                                             |
> | Note that it *is* legal for a normal RCU read-side critical section   |
> | to conditionally acquire a sleeping locks (as in                      |
> | mutex_trylock()), but only as long as it does not loop            |
> | indefinitely attempting to conditionally acquire that sleeping locks. |
> | The key point is that things like mutex_trylock() either return   |
> | with the mutex held, or return an error indication if the mutex was   |
> | not immediately available. Either way, mutex_trylock() returns    |
> | immediately without sleeping.                                         |
> +-----------------------------------------------------------------------+
> Documentation/RCU/Design/Requirements/Requirements.rst:1295: WARNING: Malformed table.
> 
> +-----------------------------------------------------------------------+
> | **Quick Quiz**:                                                       |
> +-----------------------------------------------------------------------+
> | Why does line 19 use rcu_access_pointer()? After all,             |
> | call_rcu() on line 25 stores into the structure, which would      |
> | interact badly with concurrent insertions. Doesn't this mean that     |
> | rcu_dereference() is required?                                    |
> +-----------------------------------------------------------------------+
> | **Answer**:                                                           |
> +-----------------------------------------------------------------------+
> | Presumably the ``->gp_lock`` acquired on line 18 excludes any         |
> | changes, including any insertions that rcu_dereference() would    |
> | protect against. Therefore, any insertions will be delayed until      |
> | after ``->gp_lock`` is released on line 25, which in turn means that  |
> | rcu_access_pointer() suffices.                                    |
> +-----------------------------------------------------------------------+
> Documentation/RCU/Design/Requirements/Requirements.rst:1351: WARNING: Malformed table.
> 
> +-----------------------------------------------------------------------+
> | **Quick Quiz**:                                                       |
> +-----------------------------------------------------------------------+
> | Earlier it was claimed that call_rcu() and kfree_rcu()        |
> | allowed updaters to avoid being blocked by readers. But how can that  |
> | be correct, given that the invocation of the callback and the freeing |
> | of the memory (respectively) must still wait for a grace period to    |
> | elapse?                                                               |
> +-----------------------------------------------------------------------+
> | **Answer**:                                                           |
> +-----------------------------------------------------------------------+
> | We could define things this way, but keep in mind that this sort of   |
> | definition would say that updates in garbage-collected languages      |
> | cannot complete until the next time the garbage collector runs, which |
> | does not seem at all reasonable. The key point is that in most cases, |
> | an updater using either call_rcu() or kfree_rcu() can proceed |
> | to the next update as soon as it has invoked call_rcu() or        |
> | kfree_rcu(), without having to wait for a subsequent grace        |
> | period.                                                               |
> +-----------------------------------------------------------------------+
> Documentation/RCU/Design/Requirements/Requirements.rst:1893: WARNING: Malformed table.
> 
> +-----------------------------------------------------------------------+
> | **Quick Quiz**:                                                       |
> +-----------------------------------------------------------------------+
> | Wait a minute! Each RCU callbacks must wait for a grace period to     |
> | complete, and rcu_barrier() must wait for each pre-existing       |
> | callback to be invoked. Doesn't rcu_barrier() therefore need to   |
> | wait for a full grace period if there is even one callback posted     |
> | anywhere in the system?                                               |
> +-----------------------------------------------------------------------+
> | **Answer**:                                                           |
> +-----------------------------------------------------------------------+
> | Absolutely not!!!                                                     |
> | Yes, each RCU callbacks must wait for a grace period to complete, but |
> | it might well be partly (or even completely) finished waiting by the  |
> | time rcu_barrier() is invoked. In that case, rcu_barrier()    |
> | need only wait for the remaining portion of the grace period to       |
> | elapse. So even if there are quite a few callbacks posted,            |
> | rcu_barrier() might well return quite quickly.                    |
> |                                                                       |
> | So if you need to wait for a grace period as well as for all          |
> | pre-existing callbacks, you will need to invoke both                  |
> | synchronize_rcu() and rcu_barrier(). If latency is a concern, |
> | you can always use workqueues to invoke them concurrently.            |
> +-----------------------------------------------------------------------+
> Documentation/RCU/Design/Requirements/Requirements.rst:2220: WARNING: Malformed table.
> 
> +-----------------------------------------------------------------------+
> | **Quick Quiz**:                                                       |
> +-----------------------------------------------------------------------+
> | But what if my driver has a hardware interrupt handler that can run   |
> | for many seconds? I cannot invoke schedule() from an hardware     |
> | interrupt handler, after all!                                         |
> +-----------------------------------------------------------------------+
> | **Answer**:                                                           |
> +-----------------------------------------------------------------------+
> | One approach is to do ``rcu_irq_exit();rcu_irq_enter();`` every so    |
> | often. But given that long-running interrupt handlers can cause other |
> | problems, not least for response time, shouldn't you work to keep     |
> | your interrupt handler's runtime within reasonable bounds?            |
> +-----------------------------------------------------------------------+
> 
> Introduced by commit
> 
>   c0a41bf9dbc7 ("docs: Remove redundant "``" from Requirements.rst")
> 
> -- 
> Cheers,
> Stephen Rothwell




* linux-next: build warnings after merge of the rcu tree
@ 2020-11-10  6:59 Stephen Rothwell
  2020-11-10 15:21 ` Paul E. McKenney
  0 siblings, 1 reply; 8+ messages in thread
From: Stephen Rothwell @ 2020-11-10  6:59 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Linux Kernel Mailing List, Linux Next Mailing List,
	Peter Zijlstra, Mauro Carvalho Chehab, Jonathan Corbet

Hi all,

After merging the rcu tree, today's linux-next build (htmldocs)
produced these warnings:

Documentation/RCU/Design/Requirements/Requirements.rst:119: WARNING: Malformed table.

+-----------------------------------------------------------------------+
| **Quick Quiz**:                                                       |
+-----------------------------------------------------------------------+
| Wait a minute! You said that updaters can make useful forward         |
| progress concurrently with readers, but pre-existing readers will     |
| block synchronize_rcu()!!!                                        |
| Just who are you trying to fool???                                    |
+-----------------------------------------------------------------------+
| **Answer**:                                                           |
+-----------------------------------------------------------------------+
| First, if updaters do not wish to be blocked by readers, they can use |
| call_rcu() or kfree_rcu(), which will be discussed later.     |
| Second, even when using synchronize_rcu(), the other update-side  |
| code does run concurrently with readers, whether pre-existing or not. |
+-----------------------------------------------------------------------+
Documentation/RCU/Design/Requirements/Requirements.rst:178: WARNING: Malformed table.

+-----------------------------------------------------------------------+
| **Quick Quiz**:                                                       |
+-----------------------------------------------------------------------+
| Why is the synchronize_rcu() on line 28 needed?                   |
+-----------------------------------------------------------------------+
| **Answer**:                                                           |
+-----------------------------------------------------------------------+
| Without that extra grace period, memory reordering could result in    |
| do_something_dlm() executing do_something() concurrently with |
| the last bits of recovery().                                      |
+-----------------------------------------------------------------------+
Documentation/RCU/Design/Requirements/Requirements.rst:289: WARNING: Malformed table.

+-----------------------------------------------------------------------+
| **Quick Quiz**:                                                       |
+-----------------------------------------------------------------------+
| But rcu_assign_pointer() does nothing to prevent the two          |
| assignments to ``p->a`` and ``p->b`` from being reordered. Can't that |
| also cause problems?                                                  |
+-----------------------------------------------------------------------+
| **Answer**:                                                           |
+-----------------------------------------------------------------------+
| No, it cannot. The readers cannot see either of these two fields      |
| until the assignment to ``gp``, by which time both fields are fully   |
| initialized. So reordering the assignments to ``p->a`` and ``p->b``   |
| cannot possibly cause any problems.                                   |
+-----------------------------------------------------------------------+
Documentation/RCU/Design/Requirements/Requirements.rst:430: WARNING: Malformed table.

+-----------------------------------------------------------------------+
| **Quick Quiz**:                                                       |
+-----------------------------------------------------------------------+
| Without the rcu_dereference() or the rcu_access_pointer(),    |
| what destructive optimizations might the compiler make use of?        |
+-----------------------------------------------------------------------+
| **Answer**:                                                           |
+-----------------------------------------------------------------------+
| Let's start with what happens to do_something_gp() if it fails to |
| use rcu_dereference(). It could reuse a value formerly fetched    |
| from this same pointer. It could also fetch the pointer from ``gp``   |
| in a byte-at-a-time manner, resulting in *load tearing*, in turn      |
| resulting a bytewise mash-up of two distinct pointer values. It might |
| even use value-speculation optimizations, where it makes a wrong      |
| guess, but by the time it gets around to checking the value, an       |
| update has changed the pointer to match the wrong guess. Too bad      |
| about any dereferences that returned pre-initialization garbage in    |
| the meantime!                                                         |
| For remove_gp_synchronous(), as long as all modifications to      |
| ``gp`` are carried out while holding ``gp_lock``, the above           |
| optimizations are harmless. However, ``sparse`` will complain if you  |
| define ``gp`` with ``__rcu`` and then access it without using either  |
| rcu_access_pointer() or rcu_dereference().                    |
+-----------------------------------------------------------------------+
Documentation/RCU/Design/Requirements/Requirements.rst:513: WARNING: Malformed table.

+-----------------------------------------------------------------------+
| **Quick Quiz**:                                                       |
+-----------------------------------------------------------------------+
| Given that multiple CPUs can start RCU read-side critical sections at |
| any time without any ordering whatsoever, how can RCU possibly tell   |
| whether or not a given RCU read-side critical section starts before a |
| given instance of synchronize_rcu()?                              |
+-----------------------------------------------------------------------+
| **Answer**:                                                           |
+-----------------------------------------------------------------------+
| If RCU cannot tell whether or not a given RCU read-side critical      |
| section starts before a given instance of synchronize_rcu(), then |
| it must assume that the RCU read-side critical section started first. |
| In other words, a given instance of synchronize_rcu() can avoid   |
| waiting on a given RCU read-side critical section only if it can      |
| prove that synchronize_rcu() started first.                       |
| A related question is “When rcu_read_lock() doesn't generate any  |
| code, why does it matter how it relates to a grace period?” The       |
| answer is that it is not the relationship of rcu_read_lock()      |
| itself that is important, but rather the relationship of the code     |
| within the enclosed RCU read-side critical section to the code        |
| preceding and following the grace period. If we take this viewpoint,  |
| then a given RCU read-side critical section begins before a given     |
| grace period when some access preceding the grace period observes the |
| effect of some access within the critical section, in which case none |
| of the accesses within the critical section may observe the effects   |
| of any access following the grace period.                             |
|                                                                       |
| As of late 2016, mathematical models of RCU take this viewpoint, for  |
| example, see slides 62 and 63 of the `2016 LinuxCon                   |
| EU <http://www2.rdrop.com/users/paulmck/scalability/paper/LinuxMM.201 |
| 6.10.04c.LCE.pdf>`__                                                  |
| presentation.                                                         |
+-----------------------------------------------------------------------+
Documentation/RCU/Design/Requirements/Requirements.rst:548: WARNING: Malformed table.

+-----------------------------------------------------------------------+
| **Quick Quiz**:                                                       |
+-----------------------------------------------------------------------+
| The first and second guarantees require unbelievably strict ordering! |
| Are all these memory barriers *really* required?                      |
+-----------------------------------------------------------------------+
| **Answer**:                                                           |
+-----------------------------------------------------------------------+
| Yes, they really are required. To see why the first guarantee is      |
| required, consider the following sequence of events:                  |
|                                                                       |
| #. CPU 1: rcu_read_lock()                                         |
| #. CPU 1: ``q = rcu_dereference(gp); /* Very likely to return p. */`` |
| #. CPU 0: ``list_del_rcu(p);``                                        |
| #. CPU 0: synchronize_rcu() starts.                               |
| #. CPU 1: ``do_something_with(q->a);``                                |
|    ``/* No smp_mb(), so might happen after kfree(). */``              |
| #. CPU 1: rcu_read_unlock()                                       |
| #. CPU 0: synchronize_rcu() returns.                              |
| #. CPU 0: ``kfree(p);``                                               |
|                                                                       |
| Therefore, there absolutely must be a full memory barrier between the |
| end of the RCU read-side critical section and the end of the grace    |
| period.                                                               |
|                                                                       |
| The sequence of events demonstrating the necessity of the second rule |
| is roughly similar:                                                   |
|                                                                       |
| #. CPU 0: ``list_del_rcu(p);``                                        |
| #. CPU 0: synchronize_rcu() starts.                               |
| #. CPU 1: rcu_read_lock()                                         |
| #. CPU 1: ``q = rcu_dereference(gp);``                                |
|    ``/* Might return p if no memory barrier. */``                     |
| #. CPU 0: synchronize_rcu() returns.                              |
| #. CPU 0: ``kfree(p);``                                               |
| #. CPU 1: ``do_something_with(q->a); /* Boom!!! */``                  |
| #. CPU 1: rcu_read_unlock()                                       |
|                                                                       |
| And similarly, without a memory barrier between the beginning of the  |
| grace period and the beginning of the RCU read-side critical section, |
| CPU 1 might end up accessing the freelist.                            |
|                                                                       |
| The “as if” rule of course applies, so that any implementation that   |
| acts as if the appropriate memory barriers were in place is a correct |
| implementation. That said, it is much easier to fool yourself into    |
| believing that you have adhered to the as-if rule than it is to       |
| actually adhere to it!                                                |
+-----------------------------------------------------------------------+
Documentation/RCU/Design/Requirements/Requirements.rst:597: WARNING: Malformed table.

+-----------------------------------------------------------------------+
| **Quick Quiz**:                                                       |
+-----------------------------------------------------------------------+
| You claim that rcu_read_lock() and rcu_read_unlock() generate |
| absolutely no code in some kernel builds. This means that the         |
| compiler might arbitrarily rearrange consecutive RCU read-side        |
| critical sections. Given such rearrangement, if a given RCU read-side |
| critical section is done, how can you be sure that all prior RCU      |
| read-side critical sections are done? Won't the compiler              |
| rearrangements make that impossible to determine?                     |
+-----------------------------------------------------------------------+
| **Answer**:                                                           |
+-----------------------------------------------------------------------+
| In cases where rcu_read_lock() and rcu_read_unlock() generate |
| absolutely no code, RCU infers quiescent states only at special       |
| locations, for example, within the scheduler. Because calls to        |
| schedule() had better prevent calling-code accesses to shared     |
| variables from being rearranged across the call to schedule(), if |
| RCU detects the end of a given RCU read-side critical section, it     |
| will necessarily detect the end of all prior RCU read-side critical   |
| sections, no matter how aggressively the compiler scrambles the code. |
| Again, this all assumes that the compiler cannot scramble code across |
| calls to the scheduler, out of interrupt handlers, into the idle      |
| loop, into user-mode code, and so on. But if your kernel build allows |
| that sort of scrambling, you have broken far more than just RCU!      |
+-----------------------------------------------------------------------+
Documentation/RCU/Design/Requirements/Requirements.rst:738: WARNING: Malformed table.

+-----------------------------------------------------------------------+
| **Quick Quiz**:                                                       |
+-----------------------------------------------------------------------+
| Can't the compiler also reorder this code?                            |
+-----------------------------------------------------------------------+
| **Answer**:                                                           |
+-----------------------------------------------------------------------+
| No, the volatile casts in READ_ONCE() and WRITE_ONCE()        |
| prevent the compiler from reordering in this particular case.         |
+-----------------------------------------------------------------------+
Documentation/RCU/Design/Requirements/Requirements.rst:793: WARNING: Malformed table.

+-----------------------------------------------------------------------+
| **Quick Quiz**:                                                       |
+-----------------------------------------------------------------------+
| Suppose that synchronize_rcu() did wait until *all* readers had       |
| completed instead of waiting only on pre-existing readers. For how    |
| long would the updater be able to rely on there being no readers?     |
+-----------------------------------------------------------------------+
| **Answer**:                                                           |
+-----------------------------------------------------------------------+
| For no time at all. Even if synchronize_rcu() were to wait until  |
| all readers had completed, a new reader might start immediately after |
| synchronize_rcu() completed. Therefore, the code following        |
| synchronize_rcu() can *never* rely on there being no readers.     |
+-----------------------------------------------------------------------+
Documentation/RCU/Design/Requirements/Requirements.rst:1087: WARNING: Malformed table.

+-----------------------------------------------------------------------+
| **Quick Quiz**:                                                       |
+-----------------------------------------------------------------------+
| What about sleeping locks?                                            |
+-----------------------------------------------------------------------+
| **Answer**:                                                           |
+-----------------------------------------------------------------------+
| These are forbidden within Linux-kernel RCU read-side critical        |
| sections because it is not legal to place a quiescent state (in this  |
| case, voluntary context switch) within an RCU read-side critical      |
| section. However, sleeping locks may be used within userspace RCU     |
| read-side critical sections, and also within Linux-kernel sleepable   |
| RCU `(SRCU) <#Sleepable%20RCU>`__ read-side critical sections. In     |
| addition, the -rt patchset turns spinlocks into a sleeping locks so   |
| that the corresponding critical sections can be preempted, which also |
| means that these sleeplockified spinlocks (but not other sleeping     |
| locks!) may be acquire within -rt-Linux-kernel RCU read-side critical |
| sections.                                                             |
| Note that it *is* legal for a normal RCU read-side critical section   |
| to conditionally acquire a sleeping locks (as in                      |
| mutex_trylock()), but only as long as it does not loop            |
| indefinitely attempting to conditionally acquire that sleeping locks. |
| The key point is that things like mutex_trylock() either return   |
| with the mutex held, or return an error indication if the mutex was   |
| not immediately available. Either way, mutex_trylock() returns    |
| immediately without sleeping.                                         |
+-----------------------------------------------------------------------+
Documentation/RCU/Design/Requirements/Requirements.rst:1295: WARNING: Malformed table.

+-----------------------------------------------------------------------+
| **Quick Quiz**:                                                       |
+-----------------------------------------------------------------------+
| Why does line 19 use rcu_access_pointer()? After all,             |
| call_rcu() on line 25 stores into the structure, which would      |
| interact badly with concurrent insertions. Doesn't this mean that     |
| rcu_dereference() is required?                                    |
+-----------------------------------------------------------------------+
| **Answer**:                                                           |
+-----------------------------------------------------------------------+
| Presumably the ``->gp_lock`` acquired on line 18 excludes any         |
| changes, including any insertions that rcu_dereference() would    |
| protect against. Therefore, any insertions will be delayed until      |
| after ``->gp_lock`` is released on line 25, which in turn means that  |
| rcu_access_pointer() suffices.                                    |
+-----------------------------------------------------------------------+
Documentation/RCU/Design/Requirements/Requirements.rst:1351: WARNING: Malformed table.

+-----------------------------------------------------------------------+
| **Quick Quiz**:                                                       |
+-----------------------------------------------------------------------+
| Earlier it was claimed that call_rcu() and kfree_rcu()        |
| allowed updaters to avoid being blocked by readers. But how can that  |
| be correct, given that the invocation of the callback and the freeing |
| of the memory (respectively) must still wait for a grace period to    |
| elapse?                                                               |
+-----------------------------------------------------------------------+
| **Answer**:                                                           |
+-----------------------------------------------------------------------+
| We could define things this way, but keep in mind that this sort of   |
| definition would say that updates in garbage-collected languages      |
| cannot complete until the next time the garbage collector runs, which |
| does not seem at all reasonable. The key point is that in most cases, |
| an updater using either call_rcu() or kfree_rcu() can proceed |
| to the next update as soon as it has invoked call_rcu() or        |
| kfree_rcu(), without having to wait for a subsequent grace        |
| period.                                                               |
+-----------------------------------------------------------------------+
Documentation/RCU/Design/Requirements/Requirements.rst:1893: WARNING: Malformed table.

+-----------------------------------------------------------------------+
| **Quick Quiz**:                                                       |
+-----------------------------------------------------------------------+
| Wait a minute! Each RCU callbacks must wait for a grace period to     |
| complete, and rcu_barrier() must wait for each pre-existing       |
| callback to be invoked. Doesn't rcu_barrier() therefore need to   |
| wait for a full grace period if there is even one callback posted     |
| anywhere in the system?                                               |
+-----------------------------------------------------------------------+
| **Answer**:                                                           |
+-----------------------------------------------------------------------+
| Absolutely not!!!                                                     |
| Yes, each RCU callbacks must wait for a grace period to complete, but |
| it might well be partly (or even completely) finished waiting by the  |
| time rcu_barrier() is invoked. In that case, rcu_barrier()    |
| need only wait for the remaining portion of the grace period to       |
| elapse. So even if there are quite a few callbacks posted,            |
| rcu_barrier() might well return quite quickly.                    |
|                                                                       |
| So if you need to wait for a grace period as well as for all          |
| pre-existing callbacks, you will need to invoke both                  |
| synchronize_rcu() and rcu_barrier(). If latency is a concern, |
| you can always use workqueues to invoke them concurrently.            |
+-----------------------------------------------------------------------+
Documentation/RCU/Design/Requirements/Requirements.rst:2220: WARNING: Malformed table.

+-----------------------------------------------------------------------+
| **Quick Quiz**:                                                       |
+-----------------------------------------------------------------------+
| But what if my driver has a hardware interrupt handler that can run   |
| for many seconds? I cannot invoke schedule() from an hardware     |
| interrupt handler, after all!                                         |
+-----------------------------------------------------------------------+
| **Answer**:                                                           |
+-----------------------------------------------------------------------+
| One approach is to do ``rcu_irq_exit();rcu_irq_enter();`` every so    |
| often. But given that long-running interrupt handlers can cause other |
| problems, not least for response time, shouldn't you work to keep     |
| your interrupt handler's runtime within reasonable bounds?            |
+-----------------------------------------------------------------------+

Introduced by commit

  c0a41bf9dbc7 ("docs: Remove redundant "``" from Requirements.rst")

-- 
Cheers,
Stephen Rothwell


end of thread, other threads:[~2022-11-23 14:29 UTC | newest]

Thread overview: 8+ messages
2022-11-23  5:32 linux-next: build warnings after merge of the rcu tree Stephen Rothwell
2022-11-23  8:19 ` Leizhen (ThunderTown)
2022-11-23 14:27   ` Paul E. McKenney
  -- strict thread matches above, loose matches on Subject: below --
2020-11-26  6:44 Stephen Rothwell
2020-11-26 15:12 ` Paul E. McKenney
2020-11-26 17:28   ` Uladzislau Rezki
2020-11-10  6:59 Stephen Rothwell
2020-11-10 15:21 ` Paul E. McKenney
