* [RFC PATCH 00/12] advsync: Rename LOCK/UNLOCK to ACQUIRE/RELEASE
@ 2017-04-01 2:09 Akira Yokosawa
2017-04-01 2:10 ` [RFC PATCH 01/12] advsync: LOCK/UNLOCK -> ACQUIRE/RELEASE (part 1) Akira Yokosawa
` (12 more replies)
0 siblings, 13 replies; 14+ messages in thread
From: Akira Yokosawa @ 2017-04-01 2:09 UTC (permalink / raw)
To: Paul E. McKenney; +Cc: perfbook, Akira Yokosawa
From 8aa4157fbed720fd3e8a259de620184003bf09da Mon Sep 17 00:00:00 2001
From: Akira Yokosawa <akiyks@gmail.com>
Date: Sat, 1 Apr 2017 10:48:45 +0900
Subject: [RFC PATCH 00/12] advsync: Rename LOCK/UNLOCK to ACQUIRE/RELEASE
Hi Paul,
This series mostly corresponds to commit 2e4f5382d12a ("locking/doc:
Rename LOCK/UNLOCK to ACQUIRE/RELEASE") in the Linux kernel repository.
Documentation/memory-barriers.txt has been updated considerably since
it was imported into perfbook, but this series mostly just substitutes
the existing "LOCK/UNLOCK" wording.
Patches 1--3 do simple substitutions.
Patch 4 does another replacement not included in the original commit.
Patch 5 is an attempt to add a footnote on the "LOCK/UNLOCK" wording. You
may want to rewrite the text of the footnote.
Patch 6 replaces some instances of "the" with "a/an". I think they are
reasonable, but this is from a non-native speaker's point of view. I might
be missing something.
Patch 7 is to go along with current wording in memory-barriers.txt.
Patch 8 is a big patch in line count, but it just adds proper nbsps
(in the LaTeX sense).
Patch 9 adjusts the position of a footnote appended in the previous
patch set.
Patch 10 is an independent trivial typo fix.
Patches 11 and 12 are tweaks for a better layout.
Thanks, Akira
--
Akira Yokosawa (12):
advsync: LOCK/UNLOCK -> ACQUIRE/RELEASE (part 1)
advsync: LOCK/UNLOCK -> ACQUIRE/RELEASE (part 2)
advsync: LOCK/UNLOCK -> ACQUIRE/RELEASE (part 3)
advsync: More replacement to ACQUIRE
advsync: Add footnote mentioning LOCK/UNLOCK wording
advsync: Modify usage of definite article
advsync: Substitute 'guarantee' with 'implication'
advsync: Properly use nbsp
advsync: Move footnote on transitivity forward
advsync: Fix line number called out
advsync: Add extdash shortcut
advsync: Avoid indent after minipage
advsync/memorybarriers.tex | 504 +++++++++++++++++++++++----------------------
1 file changed, 254 insertions(+), 250 deletions(-)
--
2.7.4
* [RFC PATCH 01/12] advsync: LOCK/UNLOCK -> ACQUIRE/RELEASE (part 1)
From: Akira Yokosawa @ 2017-04-01 2:10 UTC (permalink / raw)
To: Paul E. McKenney; +Cc: perfbook, Akira Yokosawa
From c8b05232a7665495892fd11ec02e34b1b36b55ac Mon Sep 17 00:00:00 2001
From: Akira Yokosawa <akiyks@gmail.com>
Date: Sat, 25 Mar 2017 23:58:48 +0900
Subject: [RFC PATCH 01/12] advsync: LOCK/UNLOCK -> ACQUIRE/RELEASE (part 1)
This mostly corresponds to the first part of commit 2e4f5382d12a
("locking/doc: Rename LOCK/UNLOCK to ACQUIRE/RELEASE")[1] in the Linux
kernel repository, but replaces only what is currently present in
perfbook.
Note: "memory-barriers.txt" has more details not included here.
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2e4f5382d12a
Signed-off-by: Akira Yokosawa <akiyks@gmail.com>
---
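Note (not part of the commit): as a reading aid, here is a minimal C11
sketch of how taking and dropping a lock maps onto ACQUIRE and RELEASE
semantics. All demo_* names are hypothetical and exist only for
illustration; this is neither perfbook nor Linux kernel code, and the
single-threaded usage function merely exercises the calls rather than
demonstrating any reordering.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical toy spinlock, for illustration only. */
struct demo_lock {
	atomic_flag held;
};

static struct demo_lock demo_m = { ATOMIC_FLAG_INIT };
static int demo_shared;

/* ACQUIRE: accesses after this call cannot appear to happen before it. */
static void demo_acquire(struct demo_lock *l)
{
	while (atomic_flag_test_and_set_explicit(&l->held,
						 memory_order_acquire))
		; /* spin until the flag was previously clear */
}

/*
 * Conditional ACQUIRE: returns true only if the lock was obtained.
 * Callers should not rely on any ordering from a failed attempt.
 */
static bool demo_try_acquire(struct demo_lock *l)
{
	return !atomic_flag_test_and_set_explicit(&l->held,
						  memory_order_acquire);
}

/* RELEASE: accesses before this call cannot appear to happen after it. */
static void demo_release(struct demo_lock *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Critical section: the update stays between ACQUIRE and RELEASE. */
static int demo_increment(void)
{
	int v;

	demo_acquire(&demo_m);
	v = ++demo_shared;
	demo_release(&demo_m);
	return v;
}

/* Usage sketch, single-threaded; leaves the lock released. */
static void demo_usage(void)
{
	assert(demo_increment() == 1);
	assert(demo_increment() == 2);
	assert(demo_try_acquire(&demo_m));	/* lock was free */
	assert(!demo_try_acquire(&demo_m));	/* already held: fails */
	demo_release(&demo_m);
}
```

Accesses outside the critical section are still free to move into it,
which is the "one-way permeable" property described in the text being
patched.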
advsync/memorybarriers.tex | 28 ++++++++++++++--------------
1 file changed, 14 insertions(+), 14 deletions(-)
diff --git a/advsync/memorybarriers.tex b/advsync/memorybarriers.tex
index 165b5cb..29d3af1 100644
--- a/advsync/memorybarriers.tex
+++ b/advsync/memorybarriers.tex
@@ -1555,36 +1555,36 @@ There are a couple of types of implicit memory barriers, so called
because they are embedded into locking primitives:
\begin{enumerate}
-\item LOCK operations and
-\item UNLOCK operations.
+\item ACQUIRE operations and
+\item RELEASE operations.
\end{enumerate}
-\paragraph{LOCK Operations}
+\paragraph{ACQUIRE Operations}
-A lock operation acts as a one-way permeable barrier.
+An acquire operation acts as a one-way permeable barrier.
It guarantees that all memory
-operations after the LOCK operation will appear to happen after the LOCK
+operations after the ACQUIRE operation will appear to happen after the ACQUIRE
operation with respect to the other components of the system.
-Memory operations that occur before a LOCK operation may appear to happen
+Memory operations that occur before an ACQUIRE operation may appear to happen
after it completes.
-A LOCK operation should almost always be paired with an UNLOCK operation.
+An ACQUIRE operation should almost always be paired with a RELEASE operation.
-\paragraph{UNLOCK Operations}
+\paragraph{RELEASE Operations}
-Unlock operations also act as a one-way permeable barrier.
+Release operations also act as a one-way permeable barrier.
It guarantees that all
-memory operations before the UNLOCK operation will appear to happen before
-the UNLOCK operation with respect to the other components of the system.
+memory operations before the RELEASE operation will appear to happen before
+the RELEASE operation with respect to the other components of the system.
-Memory operations that occur after an UNLOCK operation may appear to
+Memory operations that occur after a RELEASE operation may appear to
happen before it completes.
-LOCK and UNLOCK operations are guaranteed to appear with respect to each
+ACQUIRE and RELEASE operations are guaranteed to appear with respect to each
other strictly in the order specified.
-The use of LOCK and UNLOCK operations generally precludes the need for
+The use of ACQUIRE and RELEASE operations generally precludes the need for
other sorts of memory barrier (but note the exceptions mentioned in the
Section~\ref{sec:advsync:Device Operations}).
--
2.7.4
* [RFC PATCH 02/12] advsync: LOCK/UNLOCK -> ACQUIRE/RELEASE (part 2)
From: Akira Yokosawa @ 2017-04-01 2:11 UTC (permalink / raw)
To: Paul E. McKenney; +Cc: perfbook, Akira Yokosawa
From 7e4a90f569e7ddfa68e5c71b9bf9edd88446d5c4 Mon Sep 17 00:00:00 2001
From: Akira Yokosawa <akiyks@gmail.com>
Date: Sun, 26 Mar 2017 00:27:40 +0900
Subject: [RFC PATCH 02/12] advsync: LOCK/UNLOCK -> ACQUIRE/RELEASE (part 2)
This mostly corresponds to the middle part of commit 2e4f5382d12a
("locking/doc: Rename LOCK/UNLOCK to ACQUIRE/RELEASE") in the Linux
kernel repository. Although the current memory-barriers.txt uses
somewhat different wording (such as "LOCK operation implication"),
this commit only does simple renames.
Signed-off-by: Akira Yokosawa <akiyks@gmail.com>
---
advsync/memorybarriers.tex | 40 ++++++++++++++++++++--------------------
1 file changed, 20 insertions(+), 20 deletions(-)
diff --git a/advsync/memorybarriers.tex b/advsync/memorybarriers.tex
index 29d3af1..322ffbc 100644
--- a/advsync/memorybarriers.tex
+++ b/advsync/memorybarriers.tex
@@ -2725,35 +2725,35 @@ barriers throughout.
As noted earlier, locking primitives contain implicit memory barriers.
These implicit memory barriers provide the following guarantees:
\begin{enumerate}
-\item LOCK operation guarantee:
+\item ACQUIRE operation guarantee:
\begin{itemize}
- \item Memory operations issued after the LOCK will be completed
- after the LOCK operation has completed.
- \item Memory operations issued before the LOCK may be completed
- after the LOCK operation has completed.
+ \item Memory operations issued after the ACQUIRE will be completed
+ after the ACQUIRE operation has completed.
+ \item Memory operations issued before the ACQUIRE may be completed
+ after the ACQUIRE operation has completed.
\end{itemize}
-\item UNLOCK operation guarantee:
+\item RELEASE operation guarantee:
\begin{itemize}
- \item Memory operations issued before the UNLOCK will be
- completed before the UNLOCK operation has completed.
- \item Memory operations issued after the UNLOCK may be completed
- before the UNLOCK operation has completed.
+ \item Memory operations issued before the RELEASE will be
+ completed before the RELEASE operation has completed.
+ \item Memory operations issued after the RELEASE may be completed
+ before the RELEASE operation has completed.
\end{itemize}
-\item LOCK vs LOCK guarantee:
+\item ACQUIRE vs ACQUIRE guarantee:
\begin{itemize}
- \item All LOCK operations issued before another LOCK operation
- will be completed before that LOCK operation.
+ \item All ACQUIRE operations issued before another ACQUIRE operation
+ will be completed before that ACQUIRE operation.
\end{itemize}
-\item LOCK vs UNLOCK guarantee:
+\item ACQUIRE vs RELEASE guarantee:
\begin{itemize}
- \item All LOCK operations issued before an UNLOCK operation
- will be completed before the UNLOCK operation.
- \item All UNLOCK operations issued before a LOCK operation
- will be completed before the LOCK operation.
+ \item All ACQUIRE operations issued before a RELEASE operation
+ will be completed before the RELEASE operation.
+ \item All RELEASE operations issued before an ACQUIRE operation
+ will be completed before the ACQUIRE operation.
\end{itemize}
-\item Failed conditional LOCK guarantee:
+\item Failed conditional ACQUIRE guarantee:
\begin{itemize}
- \item Certain variants of the LOCK operation may fail, either
+ \item Certain variants of the ACQUIRE operation may fail, either
due to being unable to get the lock immediately, or due
to receiving an unblocked signal or exception
whilst asleep waiting
--
2.7.4
* [RFC PATCH 03/12] advsync: LOCK/UNLOCK -> ACQUIRE/RELEASE (part 3)
From: Akira Yokosawa @ 2017-04-01 2:12 UTC (permalink / raw)
To: Paul E. McKenney; +Cc: perfbook, Akira Yokosawa
From 24e62a5c8dff4f8af1120900c8c9695e573f6758 Mon Sep 17 00:00:00 2001
From: Akira Yokosawa <akiyks@gmail.com>
Date: Sun, 26 Mar 2017 00:55:06 +0900
Subject: [RFC PATCH 03/12] advsync: LOCK/UNLOCK -> ACQUIRE/RELEASE (part 3)
This mostly corresponds to the final part of commit 2e4f5382d12a
("locking/doc: Rename LOCK/UNLOCK to ACQUIRE/RELEASE") in the Linux
kernel repository. This commit does simple replacements only.
Signed-off-by: Akira Yokosawa <akiyks@gmail.com>
---
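Note (not part of the commit): the Quick Quiz on the "Lock-Based
Critical Sections" table in the diff below can be cross-checked with a
small C model. This is only a sketch under simplifying assumptions: each
operation gets a time slot, operations grouped in braces in the table
share a slot, and only the two one-way rules are checked (everything
after ACQUIRE in program order must land strictly after it, everything
before RELEASE must land strictly before it). The demo_* names and the
slot encoding are hypothetical and not taken from perfbook.

```c
#include <assert.h>
#include <stdbool.h>

/* Operations in program order: *A, *B, ACQUIRE, *C, *D, RELEASE, *E, *F. */
enum demo_op { OP_A, OP_B, OP_ACQ, OP_C, OP_D, OP_REL, OP_E, OP_F, OP_COUNT };

/* slot[op] is the time slot in which op executes; equal slots = concurrent. */
static bool demo_is_legitimate(const int slot[OP_COUNT])
{
	for (int op = 0; op < OP_COUNT; op++) {
		/* Anything after ACQUIRE must execute strictly after it. */
		if (op > OP_ACQ && slot[op] <= slot[OP_ACQ])
			return false;
		/* Anything before RELEASE must execute strictly before it. */
		if (op < OP_REL && slot[op] >= slot[OP_REL])
			return false;
	}
	return true;
}

/* Rows 1 through 9 of the table, indexed A, B, ACQ, C, D, REL, E, F. */
static const int demo_rows[9][OP_COUNT] = {
	{ 0, 1, 2, 3, 4, 5, 6, 7 },	/* row 1: in order */
	{ 0, 1, 1, 2, 3, 4, 5, 6 },	/* row 2: {*B; ACQUIRE;} */
	{ 0, 1, 2, 3, 4, 5, 6, 0 },	/* row 3: *F before ACQUIRE */
	{ 0, 1, 2, 2, 3, 4, 4, 5 },	/* row 4: {ACQUIRE; *C;} */
	{ 4, 0, 1, 2, 3, 5, 6, 7 },	/* row 5: *A inside the section */
	{ 0, 1, 3, 2, 4, 5, 6, 7 },	/* row 6: *C before ACQUIRE */
	{ 0, 1, 2, 3, 5, 4, 6, 7 },	/* row 7: *D after RELEASE */
	{ 0, 0, 0, 1, 1, 2, 2, 2 },	/* row 8: three concurrent groups */
	{ 5, 0, 1, 2, 3, 4, 6, 5 },	/* row 9: *A after RELEASE */
};

/* Verdicts as given in the Quick Quiz answer. */
static const bool demo_expected[9] = {
	true, true, false, false, true, false, false, true, false
};

static void demo_check_table(void)
{
	for (int row = 0; row < 9; row++)
		assert(demo_is_legitimate(demo_rows[row]) ==
		       demo_expected[row]);
}
```

Under these two rules the model agrees with all nine answers in the
quiz, including row 4, where the ACQUIRE sharing a slot with *C is what
makes the ordering illegitimate.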
advsync/memorybarriers.tex | 90 +++++++++++++++++++++++-----------------------
1 file changed, 45 insertions(+), 45 deletions(-)
diff --git a/advsync/memorybarriers.tex b/advsync/memorybarriers.tex
index 322ffbc..8471f0a 100644
--- a/advsync/memorybarriers.tex
+++ b/advsync/memorybarriers.tex
@@ -2767,10 +2767,10 @@ These implicit memory barriers provide the following guarantees:
\subsubsection{Locking Examples}
-\paragraph{LOCK Followed by UNLOCK:}
-A LOCK followed by an UNLOCK may not be assumed to be a full memory barrier
-because it is possible for an access preceding the LOCK to happen after the
-LOCK, and an access following the UNLOCK to happen before the UNLOCK, and the
+\paragraph{ACQUIRE Followed by RELEASE:}
+An ACQUIRE followed by a RELEASE may not be assumed to be a full memory barrier
+because it is possible for an access preceding the ACQUIRE to happen after the
+ACQUIRE, and an access following the RELEASE to happen before the RELEASE, and the
two accesses can themselves then cross.
For example, the following:
@@ -2779,8 +2779,8 @@ For example, the following:
\scriptsize
\begin{verbatim}
1 *A = a;
- 2 LOCK
- 3 UNLOCK
+ 2 ACQUIRE
+ 3 RELEASE
4 *B = b;
\end{verbatim}
\end{minipage}
@@ -2792,24 +2792,24 @@ might well execute in the following order:
\begin{minipage}[t]{\columnwidth}
\scriptsize
\begin{verbatim}
- 2 LOCK
+ 2 ACQUIRE
4 *B = b;
1 *A = a;
- 3 UNLOCK
+ 3 RELEASE
\end{verbatim}
\end{minipage}
\vspace{5pt}
-Again, always remember that LOCK and UNLOCK are permitted to let preceding
+Again, always remember that ACQUIRE and RELEASE are permitted to let preceding
operations and following operations ``bleed in'' to the critical section
respectively.
\QuickQuiz{}
- What sequence of LOCK-UNLOCK operations \emph{would}
+ What sequence of ACQUIRE-RELEASE operations \emph{would}
act as a full memory barrier?
\QuickQuizAnswer{
- A series of two back-to-back LOCK-UNLOCK operations, or, somewhat
- less conventionally, an UNLOCK operation followed by a LOCK
+ A series of two back-to-back ACQUIRE-RELEASE operations, or, somewhat
+ less conventionally, a RELEASE operation followed by an ACQUIRE
operation.
} \QuickQuizEnd
@@ -2823,8 +2823,8 @@ respectively.
exercise for the reader.
} \QuickQuizEnd
-\paragraph{LOCK-Based Critical Sections:}
-Although a LOCK-UNLOCK pair does not act as a full memory barrier,
+\paragraph{Lock-Based Critical Sections:}
+Although an ACQUIRE-RELEASE pair does not act as a full memory barrier,
these operations \emph{do} affect memory ordering.
Consider the following code:
@@ -2835,10 +2835,10 @@ Consider the following code:
\begin{verbatim}
1 *A = a;
2 *B = b;
- 3 LOCK
+ 3 ACQUIRE
4 *C = c;
5 *D = d;
- 6 UNLOCK
+ 6 RELEASE
7 *E = e;
8 *F = f;
\end{verbatim}
@@ -2853,12 +2853,12 @@ operations concurrently:
\begin{minipage}[t]{\columnwidth}
\scriptsize
\begin{verbatim}
- 3 LOCK
+ 3 ACQUIRE
1 *A = a; *F = f;
7 *E = e;
4 *C = c; *D = d;
2 *B = b;
- 6 UNLOCK
+ 6 RELEASE
\end{verbatim}
\end{minipage}
\vspace{5pt}
@@ -2869,23 +2869,23 @@ operations concurrently:
\# & Ordering: legitimate or not? \\
\hline
\hline
- 1 & \verb|*A; *B; LOCK; *C; *D; UNLOCK; *E; *F;| \\
+ 1 & \verb|*A; *B; ACQUIRE; *C; *D; RELEASE; *E; *F;| \\
\hline
- 2 & \verb|*A; {*B; LOCK;} *C; *D; UNLOCK; *E; *F;| \\
+ 2 & \verb|*A; {*B; ACQUIRE;} *C; *D; RELEASE; *E; *F;| \\
\hline
- 3 & \verb|{*F; *A;} *B; LOCK; *C; *D; UNLOCK; *E;| \\
+ 3 & \verb|{*F; *A;} *B; ACQUIRE; *C; *D; RELEASE; *E;| \\
\hline
- 4 & \verb|*A; *B; {LOCK; *C;} *D; {UNLOCK; *E;} *F;| \\
+ 4 & \verb|*A; *B; {ACQUIRE; *C;} *D; {RELEASE; *E;} *F;| \\
\hline
- 5 & \verb|*B; LOCK; *C; *D; *A; UNLOCK; *E; *F;| \\
+ 5 & \verb|*B; ACQUIRE; *C; *D; *A; RELEASE; *E; *F;| \\
\hline
- 6 & \verb|*A; *B; *C; LOCK; *D; UNLOCK; *E; *F;| \\
+ 6 & \verb|*A; *B; *C; ACQUIRE; *D; RELEASE; *E; *F;| \\
\hline
- 7 & \verb|*A; *B; LOCK; *C; UNLOCK; *D; *E; *F;| \\
+ 7 & \verb|*A; *B; ACQUIRE; *C; RELEASE; *D; *E; *F;| \\
\hline
- 8 & \verb|{*B; *A; LOCK;} {*D; *C;} {UNLOCK; *F; *E;}| \\
+ 8 & \verb|{*B; *A; ACQUIRE;} {*D; *C;} {RELEASE; *F; *E;}| \\
\hline
- 9 & \verb|*B; LOCK; *C; *D; UNLOCK; {*F; *A;} *E;| \\
+ 9 & \verb|*B; ACQUIRE; *C; *D; RELEASE; {*F; *A;} *E;| \\
\end{tabular}
\caption{Lock-Based Critical Sections}
\label{tab:advsync:Lock-Based Critical Sections}
@@ -2896,26 +2896,26 @@ operations concurrently:
concurrently, which of the rows of
Table~\ref{tab:advsync:Lock-Based Critical Sections}
are legitimate reorderings of the assignments to variables
- ``A'' through ``F'' and the LOCK/UNLOCK operations?
- (The order in the code is A, B, LOCK, C, D, UNLOCK, E, F.)
+ ``A'' through ``F'' and the ACQUIRE/RELEASE operations?
+ (The order in the code is {\tt *A, *B, ACQUIRE, *C, *D, RELEASE, *E, *F}.)
Why or why not?
\QuickQuizAnswer{
\begin{enumerate}
\item Legitimate, executed in order.
\item Legitimate, the lock acquisition was executed concurrently
with the last assignment preceding the critical section.
- \item Illegitimate, the assignment to ``F'' must follow the LOCK
+ \item Illegitimate, the assignment to ``F'' must follow the ACQUIRE
operation.
- \item Illegitimate, the LOCK must complete before any operation in
- the critical section. However, the UNLOCK may legitimately
+ \item Illegitimate, the ACQUIRE must complete before any operation in
+ the critical section. However, the RELEASE may legitimately
be executed concurrently with subsequent operations.
- \item Legitimate, the assignment to ``A'' precedes the UNLOCK,
+ \item Legitimate, the assignment to ``A'' precedes the RELEASE,
as required, and all other operations are in order.
- \item Illegitimate, the assignment to ``C'' must follow the LOCK.
- \item Illegitimate, the assignment to ``D'' must precede the UNLOCK.
+ \item Illegitimate, the assignment to ``C'' must follow the ACQUIRE.
+ \item Illegitimate, the assignment to ``D'' must precede the RELEASE.
\item Legitimate, all assignments are ordered with respect to the
- LOCK and UNLOCK operations.
- \item Illegitimate, the assignment to ``A'' must precede the UNLOCK.
+ ACQUIRE and RELEASE operations.
+ \item Illegitimate, the assignment to ``A'' must precede the RELEASE.
\end{enumerate}
} \QuickQuizEnd
@@ -2932,10 +2932,10 @@ a pair of locks named ``M'' and ``Q''.
\nf{CPU 1} & \nf{CPU 2} \\
\hline
A = a; & E = e; \\
- LOCK M; & LOCK Q; \\
+ ACQUIRE M; & ACQUIRE Q; \\
B = b; & F = f; \\
C = c; & G = g; \\
- UNLOCK M; & UNLOCK Q; \\
+ RELEASE M; & RELEASE Q; \\
D = d; & H = h; \\
\end{tabular}}
\caption{Ordering With Multiple Locks}
@@ -2953,10 +2953,10 @@ described in the previous section.
\QuickQuizAnswer{
All CPUs must see the following ordering constraints:
\begin{enumerate}
- \item LOCK M precedes B, C, and D.
- \item UNLOCK M follows A, B, and C.
- \item LOCK Q precedes F, G, and H.
- \item UNLOCK Q follows E, F, and G.
+ \item ACQUIRE M precedes B, C, and D.
+ \item RELEASE M follows A, B, and C.
+ \item ACQUIRE Q precedes F, G, and H.
+ \item RELEASE Q follows E, F, and G.
\end{enumerate}
} \QuickQuizEnd
@@ -2972,10 +2972,10 @@ Table~\ref{tab:advsync:Ordering With Multiple CPUs on One Lock}?
\nf{CPU 1} & \nf{CPU 2} \\
\hline
A = a; & E = e; \\
- LOCK M; & LOCK M; \\
+ ACQUIRE M; & ACQUIRE M; \\
B = b; & F = f; \\
C = c; & G = g; \\
- UNLOCK M; & UNLOCK M; \\
+ RELEASE M; & RELEASE M; \\
D = d; & H = h; \\
\end{tabular}}
\caption{Ordering With Multiple CPUs on One Lock}
--
2.7.4
* [RFC PATCH 04/12] advsync: More replacement to ACQUIRE
From: Akira Yokosawa @ 2017-04-01 2:13 UTC (permalink / raw)
To: Paul E. McKenney; +Cc: perfbook, Akira Yokosawa
From d2a400201e3f0379be01a296ebefee74ae106410 Mon Sep 17 00:00:00 2001
From: Akira Yokosawa <akiyks@gmail.com>
Date: Thu, 30 Mar 2017 00:03:50 +0900
Subject: [RFC PATCH 04/12] advsync: More replacement to ACQUIRE
Signed-off-by: Akira Yokosawa <akiyks@gmail.com>
---
advsync/memorybarriers.tex | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/advsync/memorybarriers.tex b/advsync/memorybarriers.tex
index 8471f0a..ae6ca9f 100644
--- a/advsync/memorybarriers.tex
+++ b/advsync/memorybarriers.tex
@@ -2757,7 +2757,7 @@ These implicit memory barriers provide the following guarantees:
due to being unable to get the lock immediately, or due
to receiving an unblocked signal or exception
whilst asleep waiting
- for the lock to become available. Failed locks do not
+ for the lock to become available. Failed ACQUIREs do not
imply any sort of barrier.
\end{itemize}
\end{enumerate}
--
2.7.4
* [RFC PATCH 05/12] advsync: Add footnote mentioning LOCK/UNLOCK wording
From: Akira Yokosawa @ 2017-04-01 2:14 UTC (permalink / raw)
To: Paul E. McKenney; +Cc: perfbook, Akira Yokosawa
From 41b970ba205df9069ce9921211e6214a4c138354 Mon Sep 17 00:00:00 2001
From: Akira Yokosawa <akiyks@gmail.com>
Date: Sun, 26 Mar 2017 00:16:07 +0900
Subject: [RFC PATCH 05/12] advsync: Add footnote mentioning LOCK/UNLOCK wording
The text of the footnote should be regarded as a stub.
Signed-off-by: Akira Yokosawa <akiyks@gmail.com>
---
advsync/memorybarriers.tex | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/advsync/memorybarriers.tex b/advsync/memorybarriers.tex
index ae6ca9f..60d32b0 100644
--- a/advsync/memorybarriers.tex
+++ b/advsync/memorybarriers.tex
@@ -1556,7 +1556,11 @@ because they are embedded into locking primitives:
\begin{enumerate}
\item ACQUIRE operations and
-\item RELEASE operations.
+\item RELEASE operations.\footnote{
+ Note that ACQUIRE/RELEASE operations used to be called
+ ``LOCK/UNLOCK'' operations in the Linux kernel community.
+ ``ACQUIRE/RELEASE'' is preferred when discussing
+ lockless techniques.}
\end{enumerate}
\paragraph{ACQUIRE Operations}
--
2.7.4
* [RFC PATCH 06/12] advsync: Modify usage of definite article
From: Akira Yokosawa @ 2017-04-01 2:16 UTC (permalink / raw)
To: Paul E. McKenney; +Cc: perfbook, Akira Yokosawa
From 6dcde543c62b0effd48ae684fd0a305f0ae8338e Mon Sep 17 00:00:00 2001
From: Akira Yokosawa <akiyks@gmail.com>
Date: Sun, 26 Mar 2017 08:55:07 +0900
Subject: [RFC PATCH 06/12] advsync: Modify usage of definite article
These "ACQUIRE/RELEASE"s do not need definite articles.
Signed-off-by: Akira Yokosawa <akiyks@gmail.com>
---
advsync/memorybarriers.tex | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/advsync/memorybarriers.tex b/advsync/memorybarriers.tex
index 60d32b0..6215311 100644
--- a/advsync/memorybarriers.tex
+++ b/advsync/memorybarriers.tex
@@ -2731,16 +2731,16 @@ These implicit memory barriers provide the following guarantees:
\begin{enumerate}
\item ACQUIRE operation guarantee:
\begin{itemize}
- \item Memory operations issued after the ACQUIRE will be completed
+ \item Memory operations issued after an ACQUIRE will be completed
after the ACQUIRE operation has completed.
- \item Memory operations issued before the ACQUIRE may be completed
+ \item Memory operations issued before an ACQUIRE may be completed
after the ACQUIRE operation has completed.
\end{itemize}
\item RELEASE operation guarantee:
\begin{itemize}
- \item Memory operations issued before the RELEASE will be
+ \item Memory operations issued before a RELEASE will be
completed before the RELEASE operation has completed.
- \item Memory operations issued after the RELEASE may be completed
+ \item Memory operations issued after a RELEASE may be completed
before the RELEASE operation has completed.
\end{itemize}
\item ACQUIRE vs ACQUIRE guarantee:
@@ -2757,7 +2757,7 @@ These implicit memory barriers provide the following guarantees:
\end{itemize}
\item Failed conditional ACQUIRE guarantee:
\begin{itemize}
- \item Certain variants of the ACQUIRE operation may fail, either
+ \item Certain variants of ACQUIRE operation may fail, either
due to being unable to get the lock immediately, or due
to receiving an unblocked signal or exception
whilst asleep waiting
--
2.7.4
* [RFC PATCH 07/12] advsync: Substitute 'guarantee' with 'implication'
From: Akira Yokosawa @ 2017-04-01 2:17 UTC (permalink / raw)
To: Paul E. McKenney; +Cc: perfbook, Akira Yokosawa
From 8dac6729398f02c38bbe9df6b4ad137678420ac6 Mon Sep 17 00:00:00 2001
From: Akira Yokosawa <akiyks@gmail.com>
Date: Sun, 26 Mar 2017 09:36:22 +0900
Subject: [RFC PATCH 07/12] advsync: Substitute 'guarantee' with 'implication'
Modify the wording to match what is found in the current
memory-barriers.txt (although this part is now under the section
"LOCK ACQUISITION FUNCTIONS").
Signed-off-by: Akira Yokosawa <akiyks@gmail.com>
---
advsync/memorybarriers.tex | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/advsync/memorybarriers.tex b/advsync/memorybarriers.tex
index 6215311..64d78f4 100644
--- a/advsync/memorybarriers.tex
+++ b/advsync/memorybarriers.tex
@@ -2727,35 +2727,35 @@ barriers throughout.
\label{sec:advsync:Locking Constraints}
As noted earlier, locking primitives contain implicit memory barriers.
-These implicit memory barriers provide the following guarantees:
+These primitives imply the following:
\begin{enumerate}
-\item ACQUIRE operation guarantee:
+\item ACQUIRE operation implication:
\begin{itemize}
\item Memory operations issued after an ACQUIRE will be completed
after the ACQUIRE operation has completed.
\item Memory operations issued before an ACQUIRE may be completed
after the ACQUIRE operation has completed.
\end{itemize}
-\item RELEASE operation guarantee:
+\item RELEASE operation implication:
\begin{itemize}
\item Memory operations issued before a RELEASE will be
completed before the RELEASE operation has completed.
\item Memory operations issued after a RELEASE may be completed
before the RELEASE operation has completed.
\end{itemize}
-\item ACQUIRE vs ACQUIRE guarantee:
+\item ACQUIRE vs ACQUIRE implication:
\begin{itemize}
\item All ACQUIRE operations issued before another ACQUIRE operation
will be completed before that ACQUIRE operation.
\end{itemize}
-\item ACQUIRE vs RELEASE guarantee:
+\item ACQUIRE vs RELEASE implication:
\begin{itemize}
\item All ACQUIRE operations issued before a RELEASE operation
will be completed before the RELEASE operation.
\item All RELEASE operations issued before an ACQUIRE operation
will be completed before the ACQUIRE operation.
\end{itemize}
-\item Failed conditional ACQUIRE guarantee:
+\item Failed conditional ACQUIRE implication:
\begin{itemize}
\item Certain variants of ACQUIRE operation may fail, either
due to being unable to get the lock immediately, or due
--
2.7.4
* [RFC PATCH 08/12] advsync: Properly use nbsp
From: Akira Yokosawa @ 2017-04-01 2:19 UTC (permalink / raw)
To: Paul E. McKenney; +Cc: perfbook, Akira Yokosawa
From eb80c7c23661824d63bb8e97c951dce03e158769 Mon Sep 17 00:00:00 2001
From: Akira Yokosawa <akiyks@gmail.com>
Date: Sun, 26 Mar 2017 16:39:51 +0900
Subject: [RFC PATCH 08/12] advsync: Properly use nbsp
These nbsps prevent undesirable line breaks such as
... variable
------------ page/column break ------------
A ...
Such an "A" at the start of a line might be misread as an indefinite
article.
Single-letter variable names and short numbers at the beginning of
a line should be avoided, for the same reason we prevent line breaks
between "Figure" and "\ref{}".
This commit also adds nbsps in lists of initial values as was done in
commit 09f380606ca7 ("advsync: Properly use nbsp in initial values").
Some short conditionals are enclosed in \nbco{} to avoid line
breaks.
Signed-off-by: Akira Yokosawa <akiyks@gmail.com>
---
advsync/memorybarriers.tex | 336 ++++++++++++++++++++++-----------------------
1 file changed, 168 insertions(+), 168 deletions(-)
diff --git a/advsync/memorybarriers.tex b/advsync/memorybarriers.tex
index 64d78f4..173d7ad 100644
--- a/advsync/memorybarriers.tex
+++ b/advsync/memorybarriers.tex
@@ -32,13 +32,13 @@ in shared memory.
For example, the litmus test in
Table~\ref{tab:advsync:Memory Misordering: Dekker}
appears to guarantee that the assertion never fires.
-After all, if \co{r1 != 1}, we might hope that Thread~1's load from \co{y}
-must have happened before Thread~2's store to \co{y}, which might raise
-further hopes that Thread~2's load from \co{x} must happen after
-Thread~1's store to \co{x}, so that \co{r2 == 1}, as required by the
+After all, if \nbco{r1 != 1}, we might hope that Thread~1's load from~\co{y}
+must have happened before Thread~2's store to~\co{y}, which might raise
+further hopes that Thread~2's load from~\co{x} must happen after
+Thread~1's store to~\co{x}, so that \nbco{r2 == 1}, as required by the
assertion.
The example is symmetric, so similar hopeful reasoning might lead
-us to hope that \co{r2 != 1} guarantees that \co{r1 == 1}.
+us to hope that \nbco{r2 != 1} guarantees that \nbco{r1 == 1}.
Unfortunately, the lack of memory barriers in
Table~\ref{tab:advsync:Memory Misordering: Dekker}
dashes these hopes.
@@ -138,7 +138,7 @@ Memory ordering and memory barriers can be extremely counter-intuitive.
For example, consider the functions shown in
Figure~\ref{fig:advsync:Parallel Hardware is Non-Causal}
executing in parallel
-where variables A, B, and C are initially zero:
+where variables~A, B, and~C are initially zero:
\begin{figure}[htbp]
{ \scriptsize
@@ -173,10 +173,10 @@ where variables A, B, and C are initially zero:
\label{fig:advsync:Parallel Hardware is Non-Causal}
\end{figure}
-Intuitively, \co{thread0()} assigns to B after it assigns to A,
-\co{thread1()} waits until \co{thread0()} has assigned to B before
-assigning to C, and \co{thread2()} waits until \co{thread1()} has
-assigned to C before referencing A.
+Intuitively, \co{thread0()} assigns to~B after it assigns to~A,
+\co{thread1()} waits until \co{thread0()} has assigned to~B before
+assigning to~C, and \co{thread2()} waits until \co{thread1()} has
+assigned to~C before referencing~A.
Therefore, again intuitively, the assertion on line~21 cannot possibly
fire.
@@ -185,7 +185,7 @@ and utterly incorrect.
Please note that this is \emph{not} a theoretical assertion:
actually running this code on real-world weakly-ordered hardware
(a 1.5GHz 16-CPU POWER 5 system) resulted in the assertion firing
-16 times out of 10 million runs.
+16~times out of 10~million runs.
Clearly, anyone who produces code with explicit memory barriers
should do some extreme testing---although a proof of correctness might
be helpful, the strongly counter-intuitive nature of the behavior of
@@ -201,8 +201,8 @@ greatly \emph{increase} the probability of failure in this run.
\emph{possibly} fail?
\QuickQuizAnswer{
The key point is that the intuitive analysis missed is that
- there is nothing preventing the assignment to C from overtaking
- the assignment to A as both race to reach {\tt thread2()}.
+ there is nothing preventing the assignment to~C from overtaking
+ the assignment to~A as both race to reach \co{thread2()}.
This is explained in the remainder of this section.
} \QuickQuizEnd
@@ -309,11 +309,11 @@ with the black regions to the left indicating the time before the
corresponding CPU's first measurement.
During the first 5ns, only CPU~3 has an opinion about the value of the
variable.
-During the next 10ns, CPUs~2 and 3 disagree on the value of the variable,
-but thereafter agree that the value is ``2'', which is in fact
+During the next 10ns, CPUs~2 and~3 disagree on the value of the variable,
+but thereafter agree that the value is~``2'', which is in fact
the final agreed-upon value.
-However, CPU~1 believes that the value is ``1'' for almost 300ns, and
-CPU~4 believes that the value is ``4'' for almost 500ns.
+However, CPU~1 believes that the value is~``1'' for almost 300ns, and
+CPU~4 believes that the value is~``4'' for almost 500ns.
\QuickQuiz{}
How could CPUs possibly have different views of the
@@ -331,15 +331,15 @@ CPU~4 believes that the value is ``4'' for almost 500ns.
} \QuickQuizEnd
\QuickQuiz{}
- Why do CPUs~2 and 3 come to agreement so quickly, when it
- takes so long for CPUs~1 and 4 to come to the party?
+ Why do CPUs~2 and~3 come to agreement so quickly, when it
+ takes so long for CPUs~1 and~4 to come to the party?
\QuickQuizAnswer{
- CPUs~2 and 3 are a pair of hardware threads on the same
+ CPUs~2 and~3 are a pair of hardware threads on the same
core, sharing the same cache hierarchy, and therefore have
very low communications latencies.
This is a NUMA, or, more accurately, a NUCA effect.
- This leads to the question of why CPUs~2 and 3 ever disagree
+ This leads to the question of why CPUs~2 and~3 ever disagree
at all.
One possible reason is that they each might have a small amount
of private cache in addition to a larger shared cache.
@@ -350,17 +350,17 @@ CPU~4 believes that the value is ``4'' for almost 500ns.
And if you think that the situation with four CPUs was intriguing, consider
Figure~\ref{fig:advsync:A Variable With More Simultaneous Values},
-which shows the same situation, but with 15 CPUs each assigning their
-number to a single shared variable at time $t=0$. Both diagrams in the
+which shows the same situation, but with 15~CPUs each assigning their
+number to a single shared variable at time~$t=0$. Both diagrams in the
figure are drawn in the same way as
Figure~\ref{fig:advsync:A Variable With Multiple Simultaneous Values}.
The only difference is that the unit of horizontal axis is timebase ticks,
-with each tick lasting about 5.3 nanoseconds.
+with each tick lasting about 5.3~nanoseconds.
The entire sequence therefore lasts a bit longer than the events recorded in
Figure~\ref{fig:advsync:A Variable With Multiple Simultaneous Values},
consistent with the increase in number of CPUs.
The upper diagram shows the overall picture, while the lower one shows
-the zoom-up of first 50 timebase ticks.
+the zoom-up of the first 50~timebase ticks.
Again, CPU~0 coordinates the test, so does not record any values.
@@ -371,10 +371,10 @@ Again, CPU~0 coordinates the test, so does not record any values.
\ContributedBy{Figure}{fig:advsync:A Variable With More Simultaneous Values}{Akira Yokosawa}
\end{figure*}
-All CPUs eventually agree on the final value of 9, but not before
-the values 15 and 12 take early leads.
+All CPUs eventually agree on the final value of~9, but not before
+the values~15 and~12 take early leads.
Note that there are fourteen different opinions on the variable's value
-at time 21 indicated by the vertical line in the lower diagram.
+at time~21 indicated by the vertical line in the lower diagram.
Note also that all CPUs see sequences whose orderings are consistent with
the directed graph shown in
Figure~\ref{fig:advsync:Possible Global Orders With More Simultaneous Values}.
@@ -468,7 +468,7 @@ and
CPU~4 sees the sequence {\tt \{4,2\}}.
This is consistent with the global sequence {\tt \{3,1,4,2\}},
but also with all five of the other sequences of these four numbers that end
-in ``2''.
+in~``2''.
Thus, there will be agreement on the sequence of values taken on
by a single variable, but there might be ambiguity.
@@ -477,7 +477,7 @@ In contrast, had the CPUs used atomic operations (such as the Linux kernel's
unique values, their observations would
be guaranteed to determine a single globally consistent sequence of values.
One of the \co{atomic_inc_return()} invocations would happen first,
-and would change the value from 0 to 1, the second from 1 to 2, and
+and would change the value from~0 to~1, the second from~1 to~2, and
so on.
The CPUs could compare notes afterwards and come to agreement on the
exact ordering of the sequence of \co{atomic_inc_return()} invocations.
@@ -500,7 +500,7 @@ commercially available computer systems.
Pair-wise memory barriers provide conditional ordering semantics.
For example, in the following set of operations, CPU~1's access to
-A does not unconditionally precede its access to B from the viewpoint
+A does not unconditionally precede its access to~B from the viewpoint
of an external logic analyzer
\IfInBook{(see Appendix~\ref{chp:app:whymb:Why Memory Barriers?}
for examples).
@@ -508,9 +508,9 @@ of an external logic analyzer
{(the system is only to act \emph{as if} the accesses
are in order; it is not necessarily required to actually
force them to be in order).}
-However, if CPU~2's access to B sees the result of CPU~1's access
-to B, then CPU~2's access to A is guaranteed to see the result of
-CPU~1's access to A.
+However, if CPU~2's access to~B sees the result of CPU~1's access
+to~B, then CPU~2's access to~A is guaranteed to see the result of
+CPU~1's access to~A.
Although some CPU families' memory barriers do in fact provide stronger,
unconditional ordering guarantees, portable code may rely only
on this weaker if-then conditional ordering guarantee.
@@ -608,7 +608,7 @@ pairings that portable software may depend on.
In this pairing, one CPU executes a pair of loads separated
by a memory barrier, while a second CPU executes a pair
of stores also separated by a memory barrier, as follows
- (both A and B are initially equal to zero):
+ (both~A and~B are initially equal to zero):
\vspace{5pt}
\begin{minipage}[t]{\columnwidth}
@@ -635,12 +635,12 @@ pairings that portable software may depend on.
\co{X==1}.
On the other hand, if \co{Y==0}, the memory-barrier condition
- does not hold, and so in this case, X could be either 0 or 1.
+ does not hold, and so in this case, X~could be either~0 or~1.
\paragraph{Pairing 2.}
In this pairing, each CPU executes a load followed by a
memory barrier followed by a store, as follows
- (both A and B are initially equal to zero):
+ (both~A and~B are initially equal to zero):
\vspace{5pt}
\begin{minipage}[t]{\columnwidth}
@@ -667,7 +667,7 @@ pairings that portable software may depend on.
so that \co{Y==0}.
On the other hand, if \co{X==0}, the memory-barrier condition
- does not hold, and so in this case, Y could be either 0 or 1.
+ does not hold, and so in this case, Y~could be either~0 or~1.
The two CPUs' code sequences are symmetric, so if \co{Y==1}
after both CPUs have finished executing these code sequences,
@@ -677,7 +677,7 @@ pairings that portable software may depend on.
In this pairing, one CPU executes a load followed by a
memory barrier followed by a store, while the other CPU
executes a pair of stores separated by a memory barrier,
- as follows (both A and B are initially equal to zero):
+ as follows (both~A and~B are initially equal to zero):
\vspace{5pt}
\begin{minipage}[t]{\columnwidth}
@@ -701,11 +701,11 @@ pairings that portable software may depend on.
Due to the pairwise nature of memory barriers, CPU~1's
store following its memory barrier must therefore see
the results of CPU~2's store preceding its memory barrier.
- This means that CPU~1's store to B will overwrite CPU~2's
- store to B, resulting in \co{B==1}.
+ This means that CPU~1's store to~B will overwrite CPU~2's
+ store to~B, resulting in \co{B==1}.
On the other hand, if \co{X==0}, the memory-barrier condition
- does not hold, and so in this case, B could be either 1 or 2.
+ does not hold, and so in this case, B~could be either~1 or~2.
\subsubsection{Pair-Wise Memory Barriers: Semi-Portable Combinations}
@@ -733,7 +733,7 @@ keep in mind that they used to be a \emph{lot} harder on some systems!
one of the loads will see the value stored by the other thread
in the ears-to-mouths scenario?
\QuickQuizAnswer{
- The scenario is as follows, with A and B both initially zero:
+ The scenario is as follows, with~A and~B both initially zero:
CPU~0: A=1; \co{smp_mb()}; r1=B;
@@ -742,12 +742,12 @@ keep in mind that they used to be a \emph{lot} harder on some systems!
If neither of the loads see the corresponding store, when both
CPUs finish, both \co{r1} and \co{r2} will be equal to zero.
Let's suppose that \co{r1} is equal to zero.
- Then we know that CPU~0's load from B happened before CPU~1's
- store to B: After all, we would have had \co{r1} equal to one
+ Then we know that CPU~0's load from~B happened before CPU~1's
+ store to~B: After all, we would have had \co{r1} equal to one
otherwise.
- But given that CPU~0's load from B happened before CPU~1's store
- to B, memory-barrier pairing guarantees that CPU~0's store to A
- happens before CPU~1's load from A, which in turn guarantees that
+ But given that CPU~0's load from~B happened before CPU~1's store
+ to~B, memory-barrier pairing guarantees that CPU~0's store to~A
+ happens before CPU~1's load from~A, which in turn guarantees that
\co{r2} will be equal to one, not zero.
Therefore, at least one of \co{r1} and \co{r2} must be nonzero,
@@ -777,8 +777,8 @@ keep in mind that they used to be a \emph{lot} harder on some systems!
Unfortunately, although this conclusion is correct on
21\textsuperscript{st}-century systems, it does not necessarily hold
on all antique 20\textsuperscript{th}-century systems.
- Suppose that the cache line containing A is initially owned
- by CPU~2, and that containing B is initially owned by CPU~1.
+ Suppose that the cache line containing~A is initially owned
+ by CPU~2, and that containing~B is initially owned by CPU~1.
Then, in systems that have invalidation queues and store
buffers, it is possible for the first assignments to ``pass
in the night'', so that the second assignments actually
@@ -812,10 +812,10 @@ these combinations in order to fully understand how this works.
(ignoring MMIO registers for the moment),
it is not possible for one of the loads to see the
results of the other load.
- However, if we know that CPU~2's load from B returned a
- newer value than CPU~1's load from B, then we also know
- that CPU~2's load from A returned either the same value
- as CPU~1's load from A or some later value.
+ However, if we know that CPU~2's load from~B returned a
+ newer value than CPU~1's load from~B, then we also know
+ that CPU~2's load from~A returned either the same value
+ as CPU~1's load from~A or some later value.
\paragraph{Mouth to Mouth, Ear to Ear.}
One of the variables is only loaded from, and the other
@@ -826,12 +826,12 @@ these combinations in order to fully understand how this works.
provided by the memory barrier.
However, it is possible to determine which store happened
- last, but this requires an additional load from B.
- If this additional load from B is executed after both
+ last, but this requires an additional load from~B.
+ If this additional load from~B is executed after both
CPUs~1 and~2 complete, and if it turns out that CPU~2's
- store to B happened last, then we know
- that CPU~2's load from A returned either the same value
- as CPU~1's load from A or some later value.
+ store to~B happened last, then we know
+ that CPU~2's load from~A returned either the same value
+ as CPU~1's load from~A or some later value.
\paragraph{Only One Store.}
Because there is only one store, only one of the variables
@@ -843,28 +843,28 @@ these combinations in order to fully understand how this works.
At least not straightforwardly.
But suppose that in combination~1 from
Table~\ref{tab:advsync:Memory-Barrier Combinations},
- CPU~1's load from A returns the value that CPU~2 stored
- to A. Then we know that CPU~1's load from B returned
- either the same value as CPU~2's load from A or some later value.
+ CPU~1's load from~A returns the value that CPU~2 stored
+ to~A. Then we know that CPU~1's load from~B returned
+ either the same value as CPU~2's load from~A or some later value.
\QuickQuiz{}
How can the other ``Only one store'' entries in
Table~\ref{tab:advsync:Memory-Barrier Combinations}
be used?
\QuickQuizAnswer{
- For combination~2, if CPU~1's load from B sees a value prior
- to CPU~2's store to B, then we know that CPU~2's load from A
- will return the same value as CPU~1's load from A, or some later
+ For combination~2, if CPU~1's load from~B sees a value prior
+ to CPU~2's store to~B, then we know that CPU~2's load from~A
+ will return the same value as CPU~1's load from~A, or some later
value.
- For combination~4, if CPU~2's load from B sees the value from
- CPU~1's store to B, then we know that CPU~2's load from A
- will return the same value as CPU~1's load from A, or some later
+ For combination~4, if CPU~2's load from~B sees the value from
+ CPU~1's store to~B, then we know that CPU~2's load from~A
+ will return the same value as CPU~1's load from~A, or some later
value.
- For combination~8, if CPU~2's load from A sees CPU~1's store
- to A, then we know that CPU~1's load from B will return the same
- value as CPU~2's load from A, or some later value.
+ For combination~8, if CPU~2's load from~A sees CPU~1's store
+ to~A, then we know that CPU~1's load from~B will return the same
+ value as CPU~2's load from~A, or some later value.
} \QuickQuizEnd
\subsubsection{Semantics Sufficient to Implement Locking}
@@ -921,7 +921,7 @@ assert(b == 2);
\QuickQuizAnswer{
If the CPU is not required to see all of its loads and
stores in order, then the {\tt b=1+a} might well see an
- old version of the variable ``a''.
+ old version of the variable~``a''.
This is why it is so very important that each CPU or thread
see all of its own loads and stores in program order.
@@ -1090,28 +1090,28 @@ a few simple rules:
for example, if a given CPU never loaded or stored
the shared variable, then it can have no opinion about
that variable's value.}
-\item If one CPU does ordered stores to variables A and B,\footnote{
- For example, by executing the store to A, a
- memory barrier, and then the store to B.}
- and if a second CPU does ordered loads from B and A,\footnote{
- For example, by executing the load from B, a
- memory barrier, and then the load from A.}
- then if the second CPU's load from B gives the value stored
- by the first CPU, then the second CPU's load from A must
+\item If one CPU does ordered stores to variables~A and~B,\footnote{
+ For example, by executing the store to~A, a
+ memory barrier, and then the store to~B.}
+ and if a second CPU does ordered loads from~B and~A,\footnote{
+ For example, by executing the load from~B, a
+ memory barrier, and then the load from~A.}
+ then if the second CPU's load from~B gives the value stored
+ by the first CPU, then the second CPU's load from~A must
give the value stored by the first CPU.
-\item If one CPU does a load from A ordered before a store to B,
- and if a second CPU does a load from B ordered before a store to A,
- and if the second CPU's load from B gives the value stored by
- the first CPU, then the first CPU's load from A must \emph{not}
+\item If one CPU does a load from~A ordered before a store to~B,
+ and if a second CPU does a load from~B ordered before a store to~A,
+ and if the second CPU's load from~B gives the value stored by
+ the first CPU, then the first CPU's load from~A must \emph{not}
give the value stored by the second CPU.
-\item If one CPU does a load from A ordered before a store to B,
- and if a second CPU does a store to B ordered before a
- store to A, and if the first CPU's load from A gives
+\item If one CPU does a load from~A ordered before a store to~B,
+ and if a second CPU does a store to~B ordered before a
+ store to~A, and if the first CPU's load from~A gives
the value stored by the second CPU, then the first CPU's
- store to B must happen after the second CPU's store to B,
+ store to~B must happen after the second CPU's store to~B,
hence the value stored by the first CPU persists.\footnote{
Or, for the more competitively oriented, the first
- CPU's store to B ``wins''.}
+ CPU's store to~B ``wins''.}
\end{enumerate}
The next section takes a more operational view of these rules.
@@ -1142,7 +1142,7 @@ interface between the CPU and rest of the system (the dotted lines).
For example, consider the following sequence of events given the
-initial values {\tt \{A = 1, B = 2\}}:
+initial values {\tt \{A~=~1, B~=~2\}}:
\vspace{5pt}
\begin{minipage}[t]{\columnwidth}
@@ -1199,7 +1199,7 @@ perceived by the loads made by another CPU in the same order as the stores were
committed.
As a further example, consider this sequence of events given the
-initial values {\tt \{A = 1, B = 2, C = 3, P = \&A, Q = \&C\}}:
+initial values {\tt \{A~=~1, B~=~2, C~=~3, P~=~\&A, Q~=~\&C\}}:
\vspace{5pt}
\begin{minipage}[t]{\columnwidth}
@@ -1215,8 +1215,8 @@ initial values {\tt \{A = 1, B = 2, C = 3, P = \&A, Q = \&C\}}:
\vspace{5pt}
There is an obvious data dependency here,
-as the value loaded into \co{D} depends on
-the address retrieved from \co{P} by CPU~2.
+as the value loaded into~\co{D} depends on
+the address retrieved from~\co{P} by CPU~2.
At the end of the sequence, any of the
following results are possible:
@@ -1232,8 +1232,8 @@ following results are possible:
\end{minipage}
\vspace{5pt}
-Note that CPU~2 will never try and load C into D because the CPU will load P
-into Q before issuing the load of *Q.
+Note that CPU~2 will never try and load~C into~D because the CPU will load~P
+into~Q before issuing the load of~*Q.
\subsection{Device Operations}
\label{sec:advsync:Device Operations}
@@ -1241,8 +1241,8 @@ into Q before issuing the load of *Q.
Some devices present their control interfaces as collections of memory
locations, but the order in which the control registers are accessed is very
important. For instance, imagine an Ethernet card with a set of internal
-registers that are accessed through an address port register (A) and a data
-port register (D). To read internal register 5, the following code might then
+registers that are accessed through an address port register~(A) and a data
+port register~(D). To read internal register~5, the following code might then
be used:
\vspace{5pt}
@@ -1594,7 +1594,7 @@ Section~\ref{sec:advsync:Device Operations}).
\QuickQuiz{}
What effect does the following sequence have on the
- order of stores to variables ``a'' and ``b''?
+ order of stores to variables~``a'' and~``b''?
\vspace{5pt}
\begin{minipage}[t]{\columnwidth}
@@ -1607,9 +1607,9 @@ Section~\ref{sec:advsync:Device Operations}).
\end{minipage}
\QuickQuizAnswer{
Absolutely none. This barrier {\em would} ensure that the
- assignments to ``a'' and ``b'' happened before any subsequent
+ assignments to~``a'' and~``b'' happened before any subsequent
assignments, but it does nothing to enforce any order of
- assignments to ``a'' and ``b'' themselves.
+ assignments to~``a'' and~``b'' themselves.
} \QuickQuizEnd
\subsubsection{What May Not Be Assumed About Memory Barriers?}
@@ -1652,7 +1652,7 @@ of the confines of a given architecture:
The usage requirements of data dependency barriers are a little subtle, and
it's not always obvious that they're needed. To illustrate, consider the
following sequence of events, with initial values
-{\tt \{A = 1, B = 2, C = 3, P = \&A, Q = \&C\}}:
+{\tt \{A~=~1, B~=~2, C~=~3, P~=~\&A, Q~=~\&C\}}:
\vspace{5pt}
\begin{minipage}[t]{\columnwidth}
@@ -1671,8 +1671,8 @@ following sequence of events, with initial values
\vspace{5pt}
There's a clear data dependency here, and it would seem intuitively
-obvious that by the end of the sequence, \co{Q} must be either \co{&A}
-or \co{&B}, and that:
+obvious that by the end of the sequence, \co{Q}~must be either~\co{&A}
+or~\co{&B}, and that:
\vspace{5pt}
\begin{minipage}[t]{\columnwidth}
@@ -1686,8 +1686,8 @@ or \co{&B}, and that:
\vspace{5pt}
Counter-intuitive though it might be, it is quite possible that
-CPU~2's perception of \co{P} might be updated \emph{before} its perception
-of \co{B}, thus leading to the following situation:
+CPU~2's perception of~\co{P} might be updated \emph{before} its perception
+of~\co{B}, thus leading to the following situation:
\vspace{5pt}
\begin{minipage}[t]{\columnwidth}
@@ -1705,7 +1705,7 @@ Alpha).
To deal with this, a data dependency barrier must be inserted between the
address load and the data load (again with initial values of
-{\tt \{A = 1, B = 2, C = 3, P = \&A, Q = \&C\}}):
+{\tt \{A~=~1, B~=~2, C~=~3, P~=~\&A, Q~=~\&C\}}):
\vspace{5pt}
\begin{minipage}[t]{\columnwidth}
@@ -1731,17 +1731,17 @@ Note that this extremely counterintuitive situation arises most easily on
machines with split caches, so that, for example, one cache bank processes
even-numbered cache lines and the other bank processes odd-numbered cache
lines.
-The pointer \co{P} might be stored in an odd-numbered cache line, and the
-variable \co{B} might be stored in an even-numbered cache line. Then, if the
+The pointer~\co{P} might be stored in an odd-numbered cache line, and the
+variable~\co{B} might be stored in an even-numbered cache line. Then, if the
even-numbered bank of the reading CPU's cache is extremely busy while the
odd-numbered bank is idle, one can see the new value of the
-pointer \co{P} (which is \co{&B}),
-but the old value of the variable \co{B} (which is 2).
+pointer~\co{P} (which is~\co{&B}),
+but the old value of the variable~\co{B} (which is~2).
Another example of where data dependency barriers might by required is where a
number is read from memory and then used to calculate the index for an array
access with initial values
-{\tt \{M[0] = 1, M[1] = 2, M[3] = 3, P = 0, Q = 3\}}:
+{\tt \{M[0]~=~1, M[1]~=~2, M[3]~=~3, P~=~0, Q~=~3\}}:
\vspace{5pt}
\begin{minipage}[t]{\columnwidth}
@@ -1799,7 +1799,7 @@ Consider the following bit of code:
This will not have the desired effect because there is no actual data
dependency, but rather a control dependency that the CPU may short-circuit
by attempting to predict the outcome in advance, so that other CPUs see
-the load from \co{y} as having happened before the load from \co{x}.
+the load from~\co{y} as having happened before the load from~\co{x}.
In such a case what's actually required is:
\vspace{5pt}
@@ -1834,13 +1834,13 @@ Control dependencies pair normally with other types of barriers.
That said, please note that neither \co{READ_ONCE()} nor \co{WRITE_ONCE()}
are optional!
Without the \co{READ_ONCE()}, the compiler might combine the load
-from \co{x} with other loads from \co{x}.
-Without the \co{WRITE_ONCE()}, the compiler might combine the store to
-\co{y} with other stores to \co{y}.
+from~\co{x} with other loads from~\co{x}.
+Without the \co{WRITE_ONCE()}, the compiler might combine the store
+to~\co{y} with other stores to~\co{y}.
Either can result in highly counterintuitive effects on ordering.
Worse yet, if the compiler is able to prove (say) that the value of
-variable \co{x} is always non-zero, it would be well within its rights
+variable~\co{x} is always non-zero, it would be well within its rights
to optimize the original example by eliminating the ``\co{if}'' statement
as follows:
@@ -1894,8 +1894,8 @@ optimization levels:
\end{minipage}
\vspace{5pt}
-Now there is no conditional between the load from \co{x} and the store to
-\co{y}, which means that the CPU is within its rights to reorder them:
+Now there is no conditional between the load from~\co{x} and the store
+to~\co{y}, which means that the CPU is within its rights to reorder them:
The conditional is absolutely required, and must be present in the
assembly code even after all compiler optimizations have been applied.
Therefore, if you need ordering in this example, you need explicit
@@ -1918,9 +1918,9 @@ memory barriers, for example, a release store:
\vspace{5pt}
The initial \co{READ_ONCE()} is still required to prevent the compiler from
-proving the value of \co{x}.
+proving the value of~\co{x}.
-In addition, you need to be careful what you do with the local variable
+In addition, you need to be careful what you do with the local variable~%
\co{q},
otherwise the compiler might be able to guess the value and again remove
the needed conditional.
@@ -1942,7 +1942,7 @@ For example:
\end{minipage}
\vspace{5pt}
-If \co{MAX} is defined to be 1, then the compiler knows that \co{(q\%MAX)} is
+If \co{MAX} is defined to be~1, then the compiler knows that \co{(q\%MAX)} is
equal to zero, in which case the compiler is within its rights to
transform the above code into the following:
@@ -1958,7 +1958,7 @@ transform the above code into the following:
\vspace{5pt}
Given this transformation, the CPU is not required to respect the ordering
-between the load from variable \co{x} and the store to variable \co{y}.
+between the load from variable~\co{x} and the store to variable~\co{y}.
It is tempting to add a \co{barrier()} to constrain the compiler,
but this does not help.
The conditional is gone, and the barrier won't bring it back.
@@ -1982,7 +1982,7 @@ that \co{MAX} is greater than one, perhaps as follows:
\end{minipage}
\vspace{5pt}
-Please note once again that the stores to \co{y} differ.
+Please note once again that the stores to~\co{y} differ.
If they were identical, as noted earlier, the compiler could pull this
store outside of the ``\co{if}'' statement.
@@ -2042,9 +2042,9 @@ not necessarily apply to code following the if-statement:
It is tempting to argue that there in fact is ordering because the
compiler cannot reorder volatile accesses and also cannot reorder
-the writes to \co{y} with the condition.
+the writes to~\co{y} with the condition.
Unfortunately for this line
-of reasoning, the compiler might compile the two writes to \co{y} as
+of reasoning, the compiler might compile the two writes to~\co{y} as
conditional-move instructions, as in this fanciful pseudo-assembly
language:
@@ -2063,7 +2063,7 @@ language:
\vspace{5pt}
A weakly ordered CPU would have no dependency of any sort between the load
-from \co{x} and the store to \co{z}.
+from~\co{x} and the store to~\co{z}.
The control dependencies would extend
only to the pair of cmov instructions and the store depending on them.
In short, control dependencies apply only to the stores in the ``\co{then}''
@@ -2071,8 +2071,8 @@ and ``\co{else}'' of the ``\co{if}'' in question (including functions
invoked by those two clauses), not to code following that ``\co{if}''.
Finally, control dependencies do \emph{not} provide transitivity.
-This is demonstrated by two related examples, with the initial values of
-\co{x} and \co{y} both being zero:
+This is demonstrated by two related examples, with the initial values
+of~\co{x} and~\co{y} both being zero:
\vspace{5pt}
\begin{minipage}[t]{\columnwidth}
@@ -2114,7 +2114,7 @@ not), then adding the following CPU would guarantee a related assertion:
But because control dependencies do \emph{not} provide transitivity, the above
assertion can fail after the combined three-CPU example completes.
If you need the three-CPU example to provide ordering, you will need
-\co{smp_mb()} between the loads and stores in the CPU 0 and CPU 1 code
+\co{smp_mb()} between the loads and stores in the CPU~0 and CPU~1 code
fragments, that is, just before or just after the ``\co{if}'' statements.
Furthermore, the original two-CPU example is very fragile and should be avoided.
@@ -2270,7 +2270,7 @@ Figure~\ref{fig:advsync:Write Barrier Ordering Semantics}.
Secondly, data dependency barriers act as partial orderings on data-dependent
loads. Consider the following sequence of events with initial values
-{\tt \{B = 7, X = 9, Y = 8, C = \&Y\}}:
+{\tt \{B~=~7, X~=~9, Y~=~8, C~=~\&Y\}}:
\vspace{5pt}
\begin{minipage}[t]{\columnwidth}
@@ -2300,13 +2300,13 @@ shown in Figure~\ref{fig:advsync:Data Dependency Barrier Omitted}.
\ContributedBy{Figure}{fig:advsync:Data Dependency Barrier Omitted}{David Howells}
\end{figure*}
-In the above example, CPU~2 perceives that \co{B} is 7,
-despite the load of \co{*C}
-(which would be \co{B}) coming after the \co{LOAD} of \co{C}.
+In the above example, CPU~2 perceives that \co{B} is~7,
+despite the load of~\co{*C}
+(which would be~\co{B}) coming after the \co{LOAD} of~\co{C}.
-If, however, a data dependency barrier were to be placed between the load of
-\co{C} and the load of \co{*C} (i.e.: \co{B}) on CPU~2, again with initial
-values of {\tt \{B = 7, X = 9, Y = 8, C = \&Y\}}:
+If, however, a data dependency barrier were to be placed between the load
+of~\co{C} and the load of~\co{*C} (i.e.:~\co{B}) on CPU~2, again with initial
+values of {\tt \{B~=~7, X~=~9, Y~=~8, C~=~\&Y\}}:
\vspace{5pt}
\begin{minipage}[t]{\columnwidth}
@@ -2338,7 +2338,7 @@ Figure~\ref{fig:advsync:Data Dependency Barrier Supplied}.
And thirdly, a read barrier acts as a partial order on loads. Consider the
following sequence of events, with initial values
-{\tt \{A = 0, B = 9\}}:
+{\tt \{A~=~0, B~=~9\}}:
\vspace{5pt}
\begin{minipage}[t]{\columnwidth}
@@ -2367,9 +2367,9 @@ shown in Figure~\ref{fig:advsync:Read Barrier Needed}.
\ContributedBy{Figure}{fig:advsync:Read Barrier Needed}{David Howells}
\end{figure*}
-If, however, a read barrier were to be placed between the load of \co{B}
-and the load of \co{A} on CPU~2, again with initial values of
-{\tt \{A = 0, B = 9\}}:
+If, however, a read barrier were to be placed between the load of~\co{B}
+and the load of~\co{A} on CPU~2, again with initial values of
+{\tt \{A~=~0, B~=~9\}}:
\vspace{5pt}
\begin{minipage}[t]{\columnwidth}
@@ -2400,9 +2400,9 @@ Figure~\ref{fig:advsync:Read Barrier Supplied}.
\end{figure*}
To illustrate this more completely, consider what could happen if the code
-contained a load of \co{A} either side of the read barrier, once again
+contained a load of~\co{A} either side of the read barrier, once again
with the same initial values of
-{\tt \{A = 0, B = 9\}}:
+{\tt \{A~=~0, B~=~9\}}:
\vspace{5pt}
\begin{minipage}[t]{\columnwidth}
@@ -2422,8 +2422,8 @@ with the same initial values of
\end{minipage}
\vspace{5pt}
-Even though the two loads of \co{A}
-both occur after the load of \co{B}, they may both
+Even though the two loads of~\co{A}
+both occur after the load of~\co{B}, they may both
come up with different values, as shown in
Figure~\ref{fig:advsync:Read Barrier Supplied, Double Load}.
@@ -2434,7 +2434,7 @@ Figure~\ref{fig:advsync:Read Barrier Supplied, Double Load}.
\ContributedBy{Figure}{fig:advsync:Read Barrier Supplied, Double Load}{David Howells}
\end{figure*}
-Of course, it may well be that CPU~1's update to \co{A} becomes perceptible
+Of course, it may well be that CPU~1's update to~\co{A} becomes perceptible
to CPU~2 before the read barrier completes, as shown in
Figure~\ref{fig:advsync:Read Barrier Supplied, Take Two}.
@@ -2445,11 +2445,11 @@ Figure~\ref{fig:advsync:Read Barrier Supplied, Take Two}.
\ContributedBy{Figure}{fig:advsync:Read Barrier Supplied, Take Two}{David Howells}
\end{figure*}
-The guarantee is that the second load will always come up with \co{A == 1}
+The guarantee is that the second load will always come up with \nbco{A == 1}
if the
-load of \co{B} came up with \co{B == 2}.
-No such guarantee exists for the first load of
-\co{A}; that may come up with either \co{A == 0} or \co{A == 1}.
+load of~\co{B} came up with \nbco{B == 2}.
+No such guarantee exists for the first load
+of~\co{A}; that may come up with either \nbco{A == 0} or \nbco{A == 1}.
\subsubsection{Read Memory Barriers vs. Load Speculation}
\label{sec:advsync:Read Memory Barriers vs. Load Speculation}
@@ -2484,7 +2484,7 @@ For example, consider the following:
On some CPUs, divide instructions can take a long time to complete,
which means that CPU~2's bus might go idle during that time.
-CPU~2 might therefore speculatively load \co{A} before the divides
+CPU~2 might therefore speculatively load~\co{A} before the divides
complete.
In the (hopefully) unlikely event of an exception from one of the divides,
this speculative load will have been wasted, but in the (again, hopefully)
@@ -2523,9 +2523,9 @@ dependent on the type of barrier used. If there was no change made to the
speculated memory location, then the speculated value will just be used,
as shown in
Figure~\ref{fig:advsync:Speculative Loads and Barrier}.
-On the other hand, if there was an update or invalidation to \co{A}
+On the other hand, if there was an update or invalidation to~\co{A}
from some other CPU, then the speculation will be cancelled and the
-value of \co{A} will be reloaded,
+value of~\co{A} will be reloaded,
as shown in Figure~\ref{fig:advsync:Speculative Loads Cancelled by Barrier}.
\begin{figure*}[htbp]
@@ -2899,8 +2899,8 @@ operations concurrently:
Given that operations grouped in curly braces are executed
concurrently, which of the rows of
Table~\ref{tab:advsync:Lock-Based Critical Sections}
- are legitimate reorderings of the assignments to variables
- ``A'' through ``F'' and the ACQUIRE/RELEASE operations?
+ are legitimate reorderings of the assignments to variables~``A''
+ through~``F'' and the ACQUIRE/RELEASE operations?
(The order in the code is {\tt *A, *B, ACQUIRE, *C, *D, RELEASE, *E, *F}.)
Why or why not?
\QuickQuizAnswer{
@@ -2908,18 +2908,18 @@ operations concurrently:
\item Legitimate, executed in order.
\item Legitimate, the lock acquisition was executed concurrently
with the last assignment preceding the critical section.
- \item Illegitimate, the assignment to ``F'' must follow the ACQUIRE
+ \item Illegitimate, the assignment to~``F'' must follow the ACQUIRE
operation.
\item Illegitimate, the ACQUIRE must complete before any operation in
the critical section. However, the RELEASE may legitimately
be executed concurrently with subsequent operations.
- \item Legitimate, the assignment to ``A'' precedes the RELEASE,
+ \item Legitimate, the assignment to~``A'' precedes the RELEASE,
as required, and all other operations are in order.
- \item Illegitimate, the assignment to ``C'' must follow the ACQUIRE.
- \item Illegitimate, the assignment to ``D'' must precede the RELEASE.
+ \item Illegitimate, the assignment to~``C'' must follow the ACQUIRE.
+ \item Illegitimate, the assignment to~``D'' must precede the RELEASE.
\item Legitimate, all assignments are ordered with respect to the
ACQUIRE and RELEASE operations.
- \item Illegitimate, the assignment to ``A'' must precede the RELEASE.
+ \item Illegitimate, the assignment to~``A'' must precede the RELEASE.
\end{enumerate}
} \QuickQuizEnd
@@ -2928,7 +2928,7 @@ Code containing multiple locks still sees ordering constraints from
those locks, but one must be careful to keep track of which lock is which.
For example, consider the code shown in
Table~\ref{tab:advsync:Ordering With Multiple Locks}, which uses
-a pair of locks named ``M'' and ``Q''.
+a pair of locks named~``M'' and~``Q''.
\begin{table}[htbp]
\scriptsize\centering{\tt
@@ -2947,7 +2947,7 @@ a pair of locks named ``M'' and ``Q''.
\end{table}
In this example, there are no guarantees as to what order the
-assignments to variables ``A'' through ``H'' will appear in, other
+assignments to variables~``A'' through~``H'' will appear in, other
than the constraints imposed by the locks themselves, as
described in the previous section.
@@ -2986,11 +2986,11 @@ Table~\ref{tab:advsync:Ordering With Multiple CPUs on One Lock}?
\label{tab:advsync:Ordering With Multiple CPUs on One Lock}
\end{table}
-In this case, either CPU~1 acquires M before CPU~2 does, or vice versa.
-In the first case, the assignments to A, B, and C must precede
-those to F, G, and H.
+In this case, either CPU~1 acquires~M before CPU~2 does, or vice versa.
+In the first case, the assignments to~A, B, and~C must precede
+those to~F, G, and~H.
On the other hand, if CPU~2 acquires the lock first, then the
-assignments to E, F, and G must precede those to B, C, and D.
+assignments to~E, F, and~G must precede those to~B, C, and~D.
\subsection{The Effects of the CPU Cache}
\label{sec:advsync:The Effects of the CPU Cache}
@@ -3046,9 +3046,9 @@ Figure~\ref{fig:advsync:Split Caches}, in which each CPU has a split
cache.
This system has the following properties:
\begin{enumerate}
-\item An odd-numbered cache line may be in cache A, cache C,
+\item An odd-numbered cache line may be in cache~A, cache~C,
in memory, or some combination of the above.
-\item An even-numbered cache line may be in cache B, cache D,
+\item An even-numbered cache line may be in cache~B, cache~D,
in memory, or some combination of the above.
\item While the CPU core is interrogating one of its caches,\footnote{
But note that in ``superscalar'' systems, the CPU
@@ -3067,7 +3067,7 @@ This system has the following properties:
stores to cache lines affected by entries in those queues.
\end{enumerate}
-In short, if cache A is busy, but cache B is idle, then CPU~1's
+In short, if cache~A is busy, but cache~B is idle, then CPU~1's
stores to odd-numbered cache lines may be delayed compared to
CPU~2's stores to even-numbered cache lines.
In not-so-extreme cases, CPU~2 may see CPU~1's operations out
--
2.7.4
* [RFC PATCH 09/12] advsync: Move footnote on transitivity forward
2017-04-01 2:09 [RFC PATCH 00/12] advsync: Rename LOCK/UNLOCK to ACQUIRE/RELEASE Akira Yokosawa
` (7 preceding siblings ...)
2017-04-01 2:19 ` [RFC PATCH 08/12] advsync: Properly use nbsp Akira Yokosawa
@ 2017-04-01 2:21 ` Akira Yokosawa
2017-04-01 2:22 ` [RFC PATCH 10/12] advsync: Fix line number called out Akira Yokosawa
` (3 subsequent siblings)
12 siblings, 0 replies; 14+ messages in thread
From: Akira Yokosawa @ 2017-04-01 2:21 UTC (permalink / raw)
To: Paul E. McKenney; +Cc: perfbook, Akira Yokosawa
From f747b77a00951868bb2ed54cc92fd25e6eb06f31 Mon Sep 17 00:00:00 2001
From: Akira Yokosawa <akiyks@gmail.com>
Date: Sun, 26 Mar 2017 17:00:35 +0900
Subject: [RFC PATCH 09/12] advsync: Move footnote on transitivity forward
The footnote appended in commit cc477e60a0ff ("advsync: Add
footnote on transitivity") should be where "transitivity" is
first mentioned in the section.
Signed-off-by: Akira Yokosawa <akiyks@gmail.com>
---
advsync/memorybarriers.tex | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/advsync/memorybarriers.tex b/advsync/memorybarriers.tex
index 173d7ad..942c430 100644
--- a/advsync/memorybarriers.tex
+++ b/advsync/memorybarriers.tex
@@ -2070,7 +2070,9 @@ In short, control dependencies apply only to the stores in the ``\co{then}''
and ``\co{else}'' of the ``\co{if}'' in question (including functions
invoked by those two clauses), not to code following that ``\co{if}''.
-Finally, control dependencies do \emph{not} provide transitivity.
+Finally, control dependencies do \emph{not} provide transitivity.\footnote{
+ Refer to Section~\ref{sec:advsync:Transitivity} for
+ the meaning of transitivity.}
This is demonstrated by two related examples, with the initial values
of~\co{x} and~\co{y} both being zero:
@@ -2166,9 +2168,7 @@ The following list of rules summarizes the lessons of this section:
\item Control dependencies pair normally with other types of barriers.
-\item Control dependencies do \emph{not} provide transitivity.\footnote{
- Refer to Section~\ref{sec:advsync:Transitivity} for
- the meaning of transitivity.}
+\item Control dependencies do \emph{not} provide transitivity.
If you need transitivity, use \co{smp_mb()}.
\end{enumerate}
--
2.7.4
* [RFC PATCH 10/12] advsync: Fix line number called out
2017-04-01 2:09 [RFC PATCH 00/12] advsync: Rename LOCK/UNLOCK to ACQUIRE/RELEASE Akira Yokosawa
` (8 preceding siblings ...)
2017-04-01 2:21 ` [RFC PATCH 09/12] advsync: Move footnote on transitivity forward Akira Yokosawa
@ 2017-04-01 2:22 ` Akira Yokosawa
2017-04-01 2:23 ` [RFC PATCH 11/12] advsync: Add extdash shortcut Akira Yokosawa
` (2 subsequent siblings)
12 siblings, 0 replies; 14+ messages in thread
From: Akira Yokosawa @ 2017-04-01 2:22 UTC (permalink / raw)
To: Paul E. McKenney; +Cc: perfbook, Akira Yokosawa
From 8eed2299c472dff87047577d847de98b0ac0ba67 Mon Sep 17 00:00:00 2001
From: Akira Yokosawa <akiyks@gmail.com>
Date: Sun, 26 Mar 2017 17:19:07 +0900
Subject: [RFC PATCH 10/12] advsync: Fix line number called out
Signed-off-by: Akira Yokosawa <akiyks@gmail.com>
---
advsync/memorybarriers.tex | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/advsync/memorybarriers.tex b/advsync/memorybarriers.tex
index 942c430..99593a6 100644
--- a/advsync/memorybarriers.tex
+++ b/advsync/memorybarriers.tex
@@ -250,7 +250,7 @@ synchronized among all CPUs (not available from all CPU architectures,
unfortunately!), and the loop from lines~3-8 records the length of
time that the variable retains the value that this CPU assigned to it.
Of course, one of the CPUs will ``win'', and would thus never exit
-the loop if not for the check on lines~6-8.
+the loop if not for the check on lines~6-7.
\QuickQuiz{}
What assumption is the code fragment
--
2.7.4
* [RFC PATCH 11/12] advsync: Add extdash shortcut
2017-04-01 2:09 [RFC PATCH 00/12] advsync: Rename LOCK/UNLOCK to ACQUIRE/RELEASE Akira Yokosawa
` (9 preceding siblings ...)
2017-04-01 2:22 ` [RFC PATCH 10/12] advsync: Fix line number called out Akira Yokosawa
@ 2017-04-01 2:23 ` Akira Yokosawa
2017-04-01 2:23 ` [RFC PATCH 12/12] advsync: Avoid indent after minipage Akira Yokosawa
2017-04-04 16:02 ` [RFC PATCH 00/12] advsync: Rename LOCK/UNLOCK to ACQUIRE/RELEASE Paul E. McKenney
12 siblings, 0 replies; 14+ messages in thread
From: Akira Yokosawa @ 2017-04-01 2:23 UTC (permalink / raw)
To: Paul E. McKenney; +Cc: perfbook, Akira Yokosawa
From 30b53cae7cbbb00daef6c6a979e6e5d39680b9b8 Mon Sep 17 00:00:00 2001
From: Akira Yokosawa <akiyks@gmail.com>
Date: Sat, 1 Apr 2017 08:48:26 +0900
Subject: [RFC PATCH 11/12] advsync: Add extdash shortcut
Signed-off-by: Akira Yokosawa <akiyks@gmail.com>
---
advsync/memorybarriers.tex | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/advsync/memorybarriers.tex b/advsync/memorybarriers.tex
index 99593a6..d2ad7b8 100644
--- a/advsync/memorybarriers.tex
+++ b/advsync/memorybarriers.tex
@@ -2828,7 +2828,7 @@ respectively.
} \QuickQuizEnd
\paragraph{Lock-Based Critical Sections:}
-Although an ACQUIRE-RELEASE pair does not act as a full memory barrier,
+Although an ACQUIRE\-/RELEASE pair does not act as a full memory barrier,
these operations \emph{do} affect memory ordering.
Consider the following code:
--
2.7.4
* [RFC PATCH 12/12] advsync: Avoid indent after minipage
2017-04-01 2:09 [RFC PATCH 00/12] advsync: Rename LOCK/UNLOCK to ACQUIRE/RELEASE Akira Yokosawa
` (10 preceding siblings ...)
2017-04-01 2:23 ` [RFC PATCH 11/12] advsync: Add extdash shortcut Akira Yokosawa
@ 2017-04-01 2:23 ` Akira Yokosawa
2017-04-04 16:02 ` [RFC PATCH 00/12] advsync: Rename LOCK/UNLOCK to ACQUIRE/RELEASE Paul E. McKenney
12 siblings, 0 replies; 14+ messages in thread
From: Akira Yokosawa @ 2017-04-01 2:23 UTC (permalink / raw)
To: Paul E. McKenney; +Cc: perfbook, Akira Yokosawa
From 8aa4157fbed720fd3e8a259de620184003bf09da Mon Sep 17 00:00:00 2001
From: Akira Yokosawa <akiyks@gmail.com>
Date: Sat, 1 Apr 2017 09:06:05 +0900
Subject: [RFC PATCH 12/12] advsync: Avoid indent after minipage
Also reduce \vspace{} param for even spacing in the result.
Signed-off-by: Akira Yokosawa <akiyks@gmail.com>
---
advsync/memorybarriers.tex | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/advsync/memorybarriers.tex b/advsync/memorybarriers.tex
index d2ad7b8..5f8ea7f 100644
--- a/advsync/memorybarriers.tex
+++ b/advsync/memorybarriers.tex
@@ -2787,9 +2787,9 @@ For example, the following:
3 RELEASE
4 *B = b;
\end{verbatim}
+\vspace{1pt}
\end{minipage}
-\vspace{5pt}
-
+%
might well execute in the following order:
\vspace{5pt}
--
2.7.4
* Re: [RFC PATCH 00/12] advsync: Rename LOCK/UNLOCK to ACQUIRE/RELEASE
2017-04-01 2:09 [RFC PATCH 00/12] advsync: Rename LOCK/UNLOCK to ACQUIRE/RELEASE Akira Yokosawa
` (11 preceding siblings ...)
2017-04-01 2:23 ` [RFC PATCH 12/12] advsync: Avoid indent after minipage Akira Yokosawa
@ 2017-04-04 16:02 ` Paul E. McKenney
12 siblings, 0 replies; 14+ messages in thread
From: Paul E. McKenney @ 2017-04-04 16:02 UTC (permalink / raw)
To: Akira Yokosawa; +Cc: perfbook
On Sat, Apr 01, 2017 at 11:09:27AM +0900, Akira Yokosawa wrote:
> From 8aa4157fbed720fd3e8a259de620184003bf09da Mon Sep 17 00:00:00 2001
> From: Akira Yokosawa <akiyks@gmail.com>
> Date: Sat, 1 Apr 2017 10:48:45 +0900
> Subject: [RFC PATCH 00/12] advsync: Rename LOCK/UNLOCK to ACQUIRE/RELEASE
>
> Hi Paul,
>
> This series mostly corresponds to commit 2e4f5382d12a ("locking/doc:
> Rename LOCK/UNLOCK to ACQUIRE/RELEASE") in Linux kernel repository.
> Although Documentation/memory-barriers.txt has a lot of updated
> contents after the import to perfbook, this series basically
> does substitution of existing "LOCK/UNLOCK".
>
> Patches 1--3 do simple substitutions.
> Patch 4 does another replacement not included in the original commit.
> Patch 5 is an attempt to add a footnote on "LOCK/UNLOCK" wording. You
> may want to rewrite the text of the footnote.
> Patch 6 replaces some of "the" with "a/an". I think they are reasonable,
> but this is from an non-native POV. I might missing something.
> Patch 7 is to go along with current wording in memory-barriers.txt.
> Patch 8 is a big patch in line count, but it just adds proper nbsps
> (in LaTeX sense).
> Patch 9 adjusts the position of a footnote appended in the previous
> patch set.
> Patch 10 is an independent trivial typo fix.
> Patches 11 and 12 are tweaks for a better layout.
Queued and pushed, thank you!
Thanx, Paul
> Thanks, Akira
> --
> Akira Yokosawa (12):
> advsync: LOCK/UNLOCK -> ACQUIRE/RELEASE (part 1)
> advsync: LOCK/UNLOCK -> ACQUIRE/RELEASE (part 2)
> advsync: LOCK/UNLOCK -> ACQUIRE/RELEASE (part 3)
> advsync: More replacement to ACQUIRE
> advsync: Add footnote mentioning LOCK/UNLOCK wording
> advsync: Modify usage of definite article
> advsync: Substitute 'guarantee' with 'implication'
> advsync: Properly use nbsp
> advsync: Move footnote on transitivity forward
> advsync: Fix line number called out
> advsync: Add extdash shortcut
> advsync: Avoid indent after minipage
>
> advsync/memorybarriers.tex | 504 +++++++++++++++++++++++----------------------
> 1 file changed, 254 insertions(+), 250 deletions(-)
>
> --
> 2.7.4
>