* [PATCH 1/2] count: Employ new scheme for snippet from count_stat.c @ 2018-10-02 15:15 Akira Yokosawa 2018-10-02 15:16 ` [PATCH 2/2] count: Adjust type of variable 'counter' with code snippet Akira Yokosawa 0 siblings, 1 reply; 6+ messages in thread From: Akira Yokosawa @ 2018-10-02 15:15 UTC (permalink / raw) To: Paul E. McKenney; +Cc: perfbook, Akira Yokosawa From 047e2e2aa09633f5605111119e2f45fd43f6febc Mon Sep 17 00:00:00 2001 From: Akira Yokosawa <akiyks@gmail.com> Date: Tue, 2 Oct 2018 23:54:33 +0900 Subject: [PATCH 1/2] count: Employ new scheme for snippet from count_stat.c Also add necessary READ_ONCE()/WRITE_ONCE() as par. the rule of where to use them provided by Paul as an outline on perfbook mail-list [1]. [1]: https://www.spinics.net/lists/perfbook/msg01788.html Suggested by: Paul E. McKenney <paulmck@linux.ibm.com> Signed-off-by: Akira Yokosawa <akiyks@gmail.com> --- CodeSamples/count/count_stat.c | 22 +++++++++++++--------- count/count.tex | 33 ++++++++------------------------- 2 files changed, 21 insertions(+), 34 deletions(-) diff --git a/CodeSamples/count/count_stat.c b/CodeSamples/count/count_stat.c index 1d72f99..0675320 100644 --- a/CodeSamples/count/count_stat.c +++ b/CodeSamples/count/count_stat.c @@ -20,28 +20,32 @@ #include "../api.h" -DEFINE_PER_THREAD(unsigned long, counter); +//\begin{snippet}[labelbase=ln:count:count_stat:inc-read,commandchars=\\\[\]] +DEFINE_PER_THREAD(unsigned long, counter); //\lnlbl{define} -void inc_count(void) +static __inline__ void inc_count(void) //\lnlbl{inc:b} { - __get_thread_var(counter)++; -} + unsigned long *p_counter = &__get_thread_var(counter); + + WRITE_ONCE(*p_counter, *p_counter + 1); +} //\lnlbl{inc:e} -__inline__ unsigned long read_count(void) +static __inline__ unsigned long read_count(void) //\lnlbl{read:b} { int t; unsigned long sum = 0; for_each_thread(t) - sum += per_thread(counter, t); + sum += READ_ONCE(per_thread(counter, t)); return sum; -} +} //\lnlbl{read:e} +//\end{snippet} -__inline__ void count_init(void) +void count_init(void) { } -__inline__ void count_cleanup(void) +void count_cleanup(void) { } diff --git a/count/count.tex b/count/count.tex index a7d4c9d..4f760a4 100644 --- a/count/count.tex +++ b/count/count.tex @@ -477,28 +477,7 @@ thread (presumably cache aligned and padded to avoid false sharing). } \QuickQuizEnd \begin{listing}[tbp] -{ \scriptsize -\begin{verbbox} - 1 DEFINE_PER_THREAD(long, counter); - 2 - 3 void inc_count(void) - 4 { - 5 __get_thread_var(counter)++; - 6 } - 7 - 8 long read_count(void) - 9 { - 10 int t; - 11 long sum = 0; - 12 - 13 for_each_thread(t) - 14 sum += per_thread(counter, t); - 15 return sum; - 16 } -\end{verbbox} -} -\centering -\theverbbox +\input{CodeSamples/count/count_stat@inc-read.fcv} \caption{Array-Based Per-Thread Statistical Counters} \label{lst:count:Array-Based Per-Thread Statistical Counters} \end{listing} @@ -506,22 +485,26 @@ thread (presumably cache aligned and padded to avoid false sharing). Such an array can be wrapped into per-thread primitives, as shown in Listing~\ref{lst:count:Array-Based Per-Thread Statistical Counters} (\path{count_stat.c}). -Line~1 defines an array containing a set of per-thread counters of +\begin{lineref}[ln:count:count_stat:inc-read] +Line~\lnref{define} defines an array containing a set of per-thread counters of type \co{long} named, creatively enough, \co{counter}. -Lines~3-6 show a function that increments the counters, using the +Lines~\lnref{inc:b}-\lnref{inc:e} +show a function that increments the counters, using the \co{__get_thread_var()} primitive to locate the currently running thread's element of the \co{counter} array. Because this element is modified only by the corresponding thread, non-atomic increment suffices. -Lines~8-16 show a function that reads out the aggregate value of the counter, +Lines~\lnref{read:b}-\lnref{read:e} +show a function that reads out the aggregate value of the counter, using the \co{for_each_thread()} primitive to iterate over the list of currently running threads, and using the \co{per_thread()} primitive to fetch the specified thread's counter. Because the hardware can fetch and store a properly aligned \co{long} atomically, and because \GCC\ is kind enough to make use of this capability, normal loads suffice, and no special atomic instructions are required. +\end{lineref} \QuickQuiz{} What other choice does \GCC\ have, anyway??? -- 2.7.4 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH 2/2] count: Adjust type of variable 'counter' with code snippet 2018-10-02 15:15 [PATCH 1/2] count: Employ new scheme for snippet from count_stat.c Akira Yokosawa @ 2018-10-02 15:16 ` Akira Yokosawa 2018-10-02 15:24 ` Akira Yokosawa 0 siblings, 1 reply; 6+ messages in thread From: Akira Yokosawa @ 2018-10-02 15:16 UTC (permalink / raw) To: Paul E. McKenney; +Cc: perfbook, Akira Yokosawa From a9c276c3da87ed327d003a70d60777f41999b4fc Mon Sep 17 00:00:00 2001 From: Akira Yokosawa <akiyks@gmail.com> Date: Wed, 3 Oct 2018 00:08:49 +0900 Subject: [PATCH 2/2] count: Adjust type of variable 'counter' with code snippet In the actual code of count_stat.c, the per-thread variable "counter" has the type "unsigned long". Signed-off-by: Akira Yokosawa <akiyks@gmail.com> --- count/count.tex | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/count/count.tex b/count/count.tex index 4f760a4..9f9fd0f 100644 --- a/count/count.tex +++ b/count/count.tex @@ -487,7 +487,7 @@ Listing~\ref{lst:count:Array-Based Per-Thread Statistical Counters} (\path{count_stat.c}). \begin{lineref}[ln:count:count_stat:inc-read] Line~\lnref{define} defines an array containing a set of per-thread counters of -type \co{long} named, creatively enough, \co{counter}. +type \co{unsigned long} named, creatively enough, \co{counter}. Lines~\lnref{inc:b}-\lnref{inc:e} show a function that increments the counters, using the @@ -501,7 +501,7 @@ show a function that reads out the aggregate value of the counter, using the \co{for_each_thread()} primitive to iterate over the list of currently running threads, and using the \co{per_thread()} primitive to fetch the specified thread's counter. -Because the hardware can fetch and store a properly aligned \co{long} +Because the hardware can fetch and store a properly aligned \co{unsigned long} atomically, and because \GCC\ is kind enough to make use of this capability, normal loads suffice, and no special atomic instructions are required. \end{lineref} -- 2.7.4 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH 2/2] count: Adjust type of variable 'counter' with code snippet 2018-10-02 15:16 ` [PATCH 2/2] count: Adjust type of variable 'counter' with code snippet Akira Yokosawa @ 2018-10-02 15:24 ` Akira Yokosawa 2018-10-02 18:21 ` Paul E. McKenney 0 siblings, 1 reply; 6+ messages in thread From: Akira Yokosawa @ 2018-10-02 15:24 UTC (permalink / raw) To: Paul E. McKenney; +Cc: perfbook, Akira Yokosawa On 2018/10/03 00:16:12 +0900, Akira Yokosawa wrote: > From a9c276c3da87ed327d003a70d60777f41999b4fc Mon Sep 17 00:00:00 2001 > From: Akira Yokosawa <akiyks@gmail.com> > Date: Wed, 3 Oct 2018 00:08:49 +0900 > Subject: [PATCH 2/2] count: Adjust type of variable 'counter' with code snippet > > In the actual code of count_stat.c, the per-thread variable > "counter" has the type "unsigned long". > And the other description on Listing 5.3 in the text needs some reworking to reflect the changes in the snippet. Paul, can you have a look into this? Thanks, Akira > Signed-off-by: Akira Yokosawa <akiyks@gmail.com> > --- > count/count.tex | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/count/count.tex b/count/count.tex > index 4f760a4..9f9fd0f 100644 > --- a/count/count.tex > +++ b/count/count.tex > @@ -487,7 +487,7 @@ Listing~\ref{lst:count:Array-Based Per-Thread Statistical Counters} > (\path{count_stat.c}). > \begin{lineref}[ln:count:count_stat:inc-read] > Line~\lnref{define} defines an array containing a set of per-thread counters of > -type \co{long} named, creatively enough, \co{counter}. > +type \co{unsigned long} named, creatively enough, \co{counter}. > > Lines~\lnref{inc:b}-\lnref{inc:e} > show a function that increments the counters, using the > @@ -501,7 +501,7 @@ show a function that reads out the aggregate value of the counter, > using the \co{for_each_thread()} primitive to iterate over the list of > currently running threads, and using the \co{per_thread()} primitive > to fetch the specified thread's counter. > -Because the hardware can fetch and store a properly aligned \co{long} > +Because the hardware can fetch and store a properly aligned \co{unsigned long} > atomically, and because \GCC\ is kind enough to make use of this capability, > normal loads suffice, and no special atomic instructions are required. > \end{lineref} > ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 2/2] count: Adjust type of variable 'counter' with code snippet 2018-10-02 15:24 ` Akira Yokosawa @ 2018-10-02 18:21 ` Paul E. McKenney 2018-10-02 22:12 ` Akira Yokosawa 0 siblings, 1 reply; 6+ messages in thread From: Paul E. McKenney @ 2018-10-02 18:21 UTC (permalink / raw) To: Akira Yokosawa; +Cc: perfbook On Wed, Oct 03, 2018 at 12:24:17AM +0900, Akira Yokosawa wrote: > On 2018/10/03 00:16:12 +0900, Akira Yokosawa wrote: > > From a9c276c3da87ed327d003a70d60777f41999b4fc Mon Sep 17 00:00:00 2001 > > From: Akira Yokosawa <akiyks@gmail.com> > > Date: Wed, 3 Oct 2018 00:08:49 +0900 > > Subject: [PATCH 2/2] count: Adjust type of variable 'counter' with code snippet > > > > In the actual code of count_stat.c, the per-thread variable > > "counter" has the type "unsigned long". Looks good, so I applied and pushed both patches, thank you! > And the other description on Listing 5.3 in the text needs some reworking > to reflect the changes in the snippet. Paul, can you have a look into > this? How does the patch below look? Thanx, Paul ------------------------------------------------------------------------ commit cc0409cfb0581924b05744a15fa3ebd5c1e298af Author: Paul E. McKenney <paulmck@linux.ibm.com> Date: Tue Oct 2 11:19:33 2018 -0700 count: Update code description and QQ based on {READ,WRITE}_ONCE() Reported-by: Akira Yokosawa <akiyks@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com> diff --git a/count/count.tex b/count/count.tex index 9f9fd0f64838..c36e7d826a7b 100644 --- a/count/count.tex +++ b/count/count.tex @@ -495,23 +495,22 @@ show a function that increments the counters, using the thread's element of the \co{counter} array. Because this element is modified only by the corresponding thread, non-atomic increment suffices. - -Lines~\lnref{read:b}-\lnref{read:e} -show a function that reads out the aggregate value of the counter, -using the \co{for_each_thread()} primitive to iterate over the list of -currently running threads, and using the \co{per_thread()} primitive -to fetch the specified thread's counter. -Because the hardware can fetch and store a properly aligned \co{unsigned long} -atomically, and because \GCC\ is kind enough to make use of this capability, -normal loads suffice, and no special atomic instructions are required. -\end{lineref} +However, this code uses \co{WRITE_ONCE()} to prevent destructive compiler +optimizations. +For but one example, the compiler is within its rights to use a +to-be-stored-to location as temporary storage, thus writing what +would be for all intents and purposes garbage to that location +just before doing the desired store. +This could of course be rather confusing to anything attempting to +read out the count. +The use of \co{WRITE_ONCE()} prevents this optimization and others besides. \QuickQuiz{} - What other choice does \GCC\ have, anyway??? + What other nasty optimizations could \GCC\ apply? \QuickQuizAnswer{ - According to the C standard, the effects of fetching a variable - that might be concurrently modified by some other thread are - undefined. + According to the C standard, the effects of doing a normal store + to a variable that might be concurrently loaded by some other + thread are undefined. It turns out that the C standard really has no other choice, given that C must support (for example) eight-bit architectures which are incapable of atomically loading a \co{long}. @@ -540,8 +539,10 @@ normal loads suffice, and no special atomic instructions are required. It is therefore reasonable to expect that any Linux-kernel adoption of C11 atomics will be incremental at best. - So why not just use plain old C-language assignment statements - to access shared variables? + But if the code is doing loads and stores that the underlying + hardware can implement with single instructions, why not just + use plain old C-language assignment statements to access shared + variables? The problem is that the compiler is within its rights to assume that no other thread is modifying the variable being accessed. Given a plain old C-language load, the compiler would therefore @@ -581,6 +582,22 @@ normal loads suffice, and no special atomic instructions are required. coding standards. } \QuickQuizEnd +Lines~\lnref{read:b}-\lnref{read:e} +show a function that reads out the aggregate value of the counter, +using the \co{for_each_thread()} primitive to iterate over the list of +currently running threads, and using the \co{per_thread()} primitive +to fetch the specified thread's counter. +This code also uses \co{READ_ONCE()} to ensure that the compiler doesn't +optimize these loads into oblivion. +For but one example, a pair of consecutive calls to \co{read_count()} +might be inlined, and an intrepid optimizer might notice that the same +locations were being summed and thus incorrectly conclude that it would +be simply wonderful to sum them once and use the resulting value twice. +This sort of optimization might be rather frustrating to people expecting +later \co{read_count()} calls to return larger values. +The use of \co{READ_ONCE()} prevents this optimization and others besides. +\end{lineref} + \QuickQuiz{} How does the per-thread \co{counter} variable in Listing~\ref{lst:count:Array-Based Per-Thread Statistical Counters} ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH 2/2] count: Adjust type of variable 'counter' with code snippet 2018-10-02 18:21 ` Paul E. McKenney @ 2018-10-02 22:12 ` Akira Yokosawa 2018-10-04 3:40 ` Paul E. McKenney 0 siblings, 1 reply; 6+ messages in thread From: Akira Yokosawa @ 2018-10-02 22:12 UTC (permalink / raw) To: Paul E. McKenney; +Cc: perfbook, Akira Yokosawa On 2018/10/02 11:21:23 -0700, Paul E. McKenney wrote: > On Wed, Oct 03, 2018 at 12:24:17AM +0900, Akira Yokosawa wrote: >> On 2018/10/03 00:16:12 +0900, Akira Yokosawa wrote: >>> From a9c276c3da87ed327d003a70d60777f41999b4fc Mon Sep 17 00:00:00 2001 >>> From: Akira Yokosawa <akiyks@gmail.com> >>> Date: Wed, 3 Oct 2018 00:08:49 +0900 >>> Subject: [PATCH 2/2] count: Adjust type of variable 'counter' with code snippet >>> >>> In the actual code of count_stat.c, the per-thread variable >>> "counter" has the type "unsigned long". > > Looks good, so I applied and pushed both patches, thank you! > >> And the other description on Listing 5.3 in the text needs some reworking >> to reflect the changes in the snippet. Paul, can you have a look into >> this? > > How does the patch below look? Looks good to me (at least for the moment). Acked-by: Akira Yokosawa <akiyks@gmail.com> And I'm anticipating your expansion on the use of READ_ONCE()/WRITE_ONCE() in, say, somewhere in toolsoftrade. Then we would be able to reduce repetitive explanation here and there. I'll continue to apply new scheme for other snippets in "count" chapter. Thanks, Akira > > Thanx, Paul > > ------------------------------------------------------------------------ > > commit cc0409cfb0581924b05744a15fa3ebd5c1e298af > Author: Paul E. McKenney <paulmck@linux.ibm.com> > Date: Tue Oct 2 11:19:33 2018 -0700 > > count: Update code description and QQ based on {READ,WRITE}_ONCE() > > Reported-by: Akira Yokosawa <akiyks@gmail.com> > Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com> > > diff --git a/count/count.tex b/count/count.tex > index 9f9fd0f64838..c36e7d826a7b 100644 > --- a/count/count.tex > +++ b/count/count.tex > @@ -495,23 +495,22 @@ show a function that increments the counters, using the > thread's element of the \co{counter} array. > Because this element is modified only by the corresponding thread, > non-atomic increment suffices. > - > -Lines~\lnref{read:b}-\lnref{read:e} > -show a function that reads out the aggregate value of the counter, > -using the \co{for_each_thread()} primitive to iterate over the list of > -currently running threads, and using the \co{per_thread()} primitive > -to fetch the specified thread's counter. > -Because the hardware can fetch and store a properly aligned \co{unsigned long} > -atomically, and because \GCC\ is kind enough to make use of this capability, > -normal loads suffice, and no special atomic instructions are required. > -\end{lineref} > +However, this code uses \co{WRITE_ONCE()} to prevent destructive compiler > +optimizations. > +For but one example, the compiler is within its rights to use a > +to-be-stored-to location as temporary storage, thus writing what > +would be for all intents and purposes garbage to that location > +just before doing the desired store. > +This could of course be rather confusing to anything attempting to > +read out the count. > +The use of \co{WRITE_ONCE()} prevents this optimization and others besides. > > \QuickQuiz{} > - What other choice does \GCC\ have, anyway??? > + What other nasty optimizations could \GCC\ apply? > \QuickQuizAnswer{ > - According to the C standard, the effects of fetching a variable > - that might be concurrently modified by some other thread are > - undefined. > + According to the C standard, the effects of doing a normal store > + to a variable that might be concurrently loaded by some other > + thread are undefined. > It turns out that the C standard really has no other choice, > given that C must support (for example) eight-bit architectures > which are incapable of atomically loading a \co{long}. > @@ -540,8 +539,10 @@ normal loads suffice, and no special atomic instructions are required. > It is therefore reasonable to expect that any Linux-kernel > adoption of C11 atomics will be incremental at best. > > - So why not just use plain old C-language assignment statements > - to access shared variables? > + But if the code is doing loads and stores that the underlying > + hardware can implement with single instructions, why not just > + use plain old C-language assignment statements to access shared > + variables? > The problem is that the compiler is within its rights to assume > that no other thread is modifying the variable being accessed. > Given a plain old C-language load, the compiler would therefore > @@ -581,6 +582,22 @@ normal loads suffice, and no special atomic instructions are required. > coding standards. > } \QuickQuizEnd > > +Lines~\lnref{read:b}-\lnref{read:e} > +show a function that reads out the aggregate value of the counter, > +using the \co{for_each_thread()} primitive to iterate over the list of > +currently running threads, and using the \co{per_thread()} primitive > +to fetch the specified thread's counter. > +This code also uses \co{READ_ONCE()} to ensure that the compiler doesn't > +optimize these loads into oblivion. > +For but one example, a pair of consecutive calls to \co{read_count()} > +might be inlined, and an intrepid optimizer might notice that the same > +locations were being summed and thus incorrectly conclude that it would > +be simply wonderful to sum them once and use the resulting value twice. > +This sort of optimization might be rather frustrating to people expecting > +later \co{read_count()} calls to return larger values. > +The use of \co{READ_ONCE()} prevents this optimization and others besides. > +\end{lineref} > + > \QuickQuiz{} > How does the per-thread \co{counter} variable in > Listing~\ref{lst:count:Array-Based Per-Thread Statistical Counters} > ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 2/2] count: Adjust type of variable 'counter' with code snippet 2018-10-02 22:12 ` Akira Yokosawa @ 2018-10-04 3:40 ` Paul E. McKenney 0 siblings, 0 replies; 6+ messages in thread From: Paul E. McKenney @ 2018-10-04 3:40 UTC (permalink / raw) To: Akira Yokosawa; +Cc: perfbook On Wed, Oct 03, 2018 at 07:12:32AM +0900, Akira Yokosawa wrote: > On 2018/10/02 11:21:23 -0700, Paul E. McKenney wrote: > > On Wed, Oct 03, 2018 at 12:24:17AM +0900, Akira Yokosawa wrote: > >> On 2018/10/03 00:16:12 +0900, Akira Yokosawa wrote: > >>> From a9c276c3da87ed327d003a70d60777f41999b4fc Mon Sep 17 00:00:00 2001 > >>> From: Akira Yokosawa <akiyks@gmail.com> > >>> Date: Wed, 3 Oct 2018 00:08:49 +0900 > >>> Subject: [PATCH 2/2] count: Adjust type of variable 'counter' with code snippet > >>> > >>> In the actual code of count_stat.c, the per-thread variable > >>> "counter" has the type "unsigned long". > > > > Looks good, so I applied and pushed both patches, thank you! > > > >> And the other description on Listing 5.3 in the text needs some reworking > >> to reflect the changes in the snippet. Paul, can you have a look into > >> this? > > > > How does the patch below look? > > Looks good to me (at least for the moment). > > Acked-by: Akira Yokosawa <akiyks@gmail.com> > > And I'm anticipating your expansion on the use of READ_ONCE()/WRITE_ONCE() > in, say, somewhere in toolsoftrade. Then we would be able to reduce > repetitive explanation here and there. I am working on this, but there is a surprising amount to say. So it might take some time. On the other hand, with the rework of the memory-ordering chapter, this part of the Tools of the Trade chapter seems to be the weakest part of the book, so it certainly makes sense for me to focus there a bit. > I'll continue to apply new scheme for other snippets in "count" chapter. Very good, looking forward to seeing them! Thanx, Paul > Thanks, Akira > > > > > Thanx, Paul > > > > ------------------------------------------------------------------------ > > > > commit cc0409cfb0581924b05744a15fa3ebd5c1e298af > > Author: Paul E. McKenney <paulmck@linux.ibm.com> > > Date: Tue Oct 2 11:19:33 2018 -0700 > > > > count: Update code description and QQ based on {READ,WRITE}_ONCE() > > > > Reported-by: Akira Yokosawa <akiyks@gmail.com> > > Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com> > > > > diff --git a/count/count.tex b/count/count.tex > > index 9f9fd0f64838..c36e7d826a7b 100644 > > --- a/count/count.tex > > +++ b/count/count.tex > > @@ -495,23 +495,22 @@ show a function that increments the counters, using the > > thread's element of the \co{counter} array. > > Because this element is modified only by the corresponding thread, > > non-atomic increment suffices. > > - > > -Lines~\lnref{read:b}-\lnref{read:e} > > -show a function that reads out the aggregate value of the counter, > > -using the \co{for_each_thread()} primitive to iterate over the list of > > -currently running threads, and using the \co{per_thread()} primitive > > -to fetch the specified thread's counter. > > -Because the hardware can fetch and store a properly aligned \co{unsigned long} > > -atomically, and because \GCC\ is kind enough to make use of this capability, > > -normal loads suffice, and no special atomic instructions are required. > > -\end{lineref} > > +However, this code uses \co{WRITE_ONCE()} to prevent destructive compiler > > +optimizations. > > +For but one example, the compiler is within its rights to use a > > +to-be-stored-to location as temporary storage, thus writing what > > +would be for all intents and purposes garbage to that location > > +just before doing the desired store. > > +This could of course be rather confusing to anything attempting to > > +read out the count. > > +The use of \co{WRITE_ONCE()} prevents this optimization and others besides. > > > > \QuickQuiz{} > > - What other choice does \GCC\ have, anyway??? > > + What other nasty optimizations could \GCC\ apply? > > \QuickQuizAnswer{ > > - According to the C standard, the effects of fetching a variable > > - that might be concurrently modified by some other thread are > > - undefined. > > + According to the C standard, the effects of doing a normal store > > + to a variable that might be concurrently loaded by some other > > + thread are undefined. > > It turns out that the C standard really has no other choice, > > given that C must support (for example) eight-bit architectures > > which are incapable of atomically loading a \co{long}. > > @@ -540,8 +539,10 @@ normal loads suffice, and no special atomic instructions are required. > > It is therefore reasonable to expect that any Linux-kernel > > adoption of C11 atomics will be incremental at best. > > > > - So why not just use plain old C-language assignment statements > > - to access shared variables? > > + But if the code is doing loads and stores that the underlying > > + hardware can implement with single instructions, why not just > > + use plain old C-language assignment statements to access shared > > + variables? > > The problem is that the compiler is within its rights to assume > > that no other thread is modifying the variable being accessed. > > Given a plain old C-language load, the compiler would therefore > > @@ -581,6 +582,22 @@ normal loads suffice, and no special atomic instructions are required. > > coding standards. > > } \QuickQuizEnd > > > > +Lines~\lnref{read:b}-\lnref{read:e} > > +show a function that reads out the aggregate value of the counter, > > +using the \co{for_each_thread()} primitive to iterate over the list of > > +currently running threads, and using the \co{per_thread()} primitive > > +to fetch the specified thread's counter. > > +This code also uses \co{READ_ONCE()} to ensure that the compiler doesn't > > +optimize these loads into oblivion. > > +For but one example, a pair of consecutive calls to \co{read_count()} > > +might be inlined, and an intrepid optimizer might notice that the same > > +locations were being summed and thus incorrectly conclude that it would > > +be simply wonderful to sum them once and use the resulting value twice. > > +This sort of optimization might be rather frustrating to people expecting > > +later \co{read_count()} calls to return larger values. > > +The use of \co{READ_ONCE()} prevents this optimization and others besides. > > +\end{lineref} > > + > > \QuickQuiz{} > > How does the per-thread \co{counter} variable in > > Listing~\ref{lst:count:Array-Based Per-Thread Statistical Counters} > > > ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2018-10-04 10:32 UTC | newest] Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2018-10-02 15:15 [PATCH 1/2] count: Employ new scheme for snippet from count_stat.c Akira Yokosawa 2018-10-02 15:16 ` [PATCH 2/2] count: Adjust type of variable 'counter' with code snippet Akira Yokosawa 2018-10-02 15:24 ` Akira Yokosawa 2018-10-02 18:21 ` Paul E. McKenney 2018-10-02 22:12 ` Akira Yokosawa 2018-10-04 3:40 ` Paul E. McKenney
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.