All of lore.kernel.org
 help / color / mirror / Atom feed
* [patch] futex.7: Semantics section: Race condition in locking semantics description
       [not found] <DM5PR02MB3687609B599F7773193DE31AC4F30@DM5PR02MB3687.namprd02.prod.outlook.com>
@ 2020-12-02 18:07 ` Sudvarg, Marion
  2020-12-02 20:47   ` Michael Kerrisk (man-pages)
  0 siblings, 1 reply; 4+ messages in thread
From: Sudvarg, Marion @ 2020-12-02 18:07 UTC (permalink / raw)
  To: mtk.manpages; +Cc: linux-man, bert

[-- Attachment #1: Type: text/plain, Size: 2336 bytes --]

Hello Michael,

I apologize if you're receiving this email a second time. I accidentally kept HTML formatting enabled the first time I sent it, causing it to be rejected as spam.

I am teaching the Operating Systems Organization course at Washington University in St. Louis, and to supplement a series of lectures on locking and synchronization, I assigned students to read the futex(7) manual page. One student, Alex Baker <mailto:alexbaker@wustl.edu>, pointed out a race condition in the description of how to "down" a futex, i.e. wait for or acquire a lock, under the Semantics section. Say we have two threads, T0 and T1, which execute as follows:

1. T0 acquires the lock, decrements the futex to 0
2. T1 switches in, attempts to acquire the lock, decrements the futex to -1
3. T0 switches in, completes its critical section
4. T0 unlocks the lock, increments the futex to 0
5. Because the futex is 0, T0 assumes threads are waiting on the futex, but no threads have yet called FUTEX_WAIT
6. T0 sets the futex to 1
7. T0 uses the FUTEX_WAKE operation
8. T1 switches in, and believing it should still wait for the futex, it sets the futex to -1
9. T1 now uses the FUTEX_WAIT operation

Because, in step 8, T1 has set the futex to -1, its call to FUTEX_WAIT in 9 will succeed, as the futex holds the expected value in the call (-1). But since T0 has already completed its execution and has called FUTEX_WAKE, T1 may never be woken.

The fwait and fpost (i.e. lock and release, or down and up) functions in the Examples section on the futex(2) page seem to be race free, but use atomic compare exchange functions, instead of the atomic increment and decrement semantics described in futex(7). 

I've attached a patch for the futex(7) man page, which modifies the Semantics section to describe a correct, race-free use of a futex for lock acquisition using atomic increment and decrement operations. I've also added a code sample to help illustrate this. I hope the addition of the code sample does not, in your opinion, add unnecessary length to this manual page, given that the futex(2) page is already so thorough.

I have copied Bert Hubert, whom I believe to be the original author of the futex(7) man page.

Thank you,

Marion Sudvarg
PhD Student, Computer Science
Washington University in St. Louis

[-- Attachment #2: futex.7.patch --]
[-- Type: application/octet-stream, Size: 5495 bytes --]

diff --git a/man7/futex.7 b/man7/futex.7
index 22f610646..f59725b61 100644
--- a/man7/futex.7
+++ b/man7/futex.7
@@ -72,19 +72,11 @@ operation.
 Waiting on a futex, to "down" it, is the reverse operation.
 Atomically decrement the counter and check if it changed to 0,
 in which case the operation is done and the futex was uncontended.
-In all other circumstances, the process should
-request that the kernel wait for another process to up the futex.
+In all other circumstances, the process should set the counter to \-1
+and request that the kernel wait for another process to up the futex.
 This is done using the
 .B FUTEX_WAIT
-operation,
-which is provided the return value of the atomic decrement operation.
-In the event that another process has modified the value of the futex
-between the atomic decrement and the
-.B FUTEX_WAIT
-operation, this guarantees that the
-.B FUTEX_WAIT
-fails, and the process may try again to "down" the futex.
-.SS 
+operation.
 .PP
 The
 .BR futex (2)
@@ -113,166 +105,6 @@ below.
 This man page illustrates the most common use of the
 .BR futex (2)
 primitives; it is by no means the only one.
-.SH EXAMPLES
-The program below demonstrates the use of futexes in a program where
-threads use a futex to synchronize access to a critical section,
-which increments a global integer variable
-.IR nloops
-(a command-line argument that defaults to 100000 if omitted)
-times.
-After the parallel section,
-the program prints the value of the global variable.
-Upon running this program we see output such as the following:
-.PP
-.in +4n
-.EX
-$ \fB./futex_demo\fP
-Ran with 2 threads
-Each thread incremented global_int 1000000 times
-Final value of global_int: 2000000
-.EE
-.in
-.SS Program source
-\&
-.EX
-/* futex_demo.c
-
-    Usage: futex_demo [nloops]
-                    (Default: 100000)
-
-    Demonstrate the use of futexes in a program where multiple threads
-    use a futex to synchronize access to a global integer variable, which
-    is initialized to 0. The two threads each increment the variable
-    \(aqnloops\(aq times, and employ a synchronization protocol that
-    ensures only one thread can access the global variable at a time.
-
-    We use OpenMP for thread parallelism;
-    therefore, this program must be compiled with the \-fopenmp flag,
-    e.g.:
-
-    gcc futex_demo.c \-o futex_demo \-fopenmp
-
-#define _GNU_SOURCE
-#include <stdio.h>
-#include <errno.h>
-#include <stdatomic.h>
-#include <stdint.h>
-#include <stdlib.h>
-#include <unistd.h>
-#include <sys/syscall.h>
-#include <linux/futex.h>
-#include <omp.h>
-
-#define errExit(msg)    do { perror(msg); exit(EXIT_FAILURE); \e
-                        } while (0)
-
-#define NUM_THREADS 2
-#define LOCKED 0
-#define UNLOCKED 1
-
-static int global_int = 0;
-static uint32_t lock = UNLOCKED;
-
-static int
-futex(uint32_t *uaddr, int futex_op, uint32_t val,
-      const struct timespec *timeout, uint32_t *uaddr2, uint32_t val3)
-{
-    return syscall(SYS_futex, uaddr, futex_op, val,
-                   timeout, uaddr2, val3);
-}
-
-/*  Increments the global integer variable nloops times.
-    Without locking, a race condition may occur. */
-
-static void
-critical_section(int nloops)
-{
-    for (int i = 0; i < nloops; i++) {
-        global_int++;
-    }
-}
-
-
-/*  Attempt to lock the futex pointed to by \(aqfutexp\(aq:
-    The futex value is decremented by 1.
-    If the futex value is now LOCKED,
-    the lock was successfully acquired.
-    Otherwise, wait for the lock to be released. */
-
-static void
-flock(uint32_t * futexp)
-{
-    int s;
-    int futex_val;
-
-    /* Attempt to acquire the lock */
-    while ( (futex_val = __atomic_sub_fetch(futexp, 1, __ATOMIC_ACQ_REL)) < LOCKED ) {
-
-        /* If the lock is not available, wait */
-
-        s = futex(futexp, FUTEX_WAIT, futex_val, NULL, NULL, 0);
-        if (s == \-1 && errno != EAGAIN)
-            errExit("futex\-FUTEX_WAIT");
-    }
-}
-
-/*  Unlock the futex pointed to by \(aqfutexp\(aq:
-    The futex value is incremented by 1.
-    
-    If the futex value is now UNLOCKED,
-    no threads are waiting for the lock.
-    Otherwise, another thread is waiting,
-    so set the value to UNLOCKED and wake. */
-
-static void
-funlock(uint32_t * futexp)
-{
-    int s;
-
-    /* Are any threads waiting for the lock? */
-    if (__atomic_add_fetch(futexp, 1, __ATOMIC_ACQ_REL) != UNLOCKED) {
-
-        /* If so, unlock and notify */
-        __atomic_store_n(futexp, UNLOCKED, __ATOMIC_RELEASE);
-        s = futex(futexp, FUTEX_WAKE, 1, NULL, NULL, 0);
-        if (s  == \-1)
-            errExit("futex\-FUTEX_WAKE");
-    }
-}
-
-int
-main(int argc, char *argv[])
-{
-    int nloops;
-    int n_threads;
-
-    nloops = (argc > 1) ? atoi(argv[1]) : 100000;
-
-    //Begin OpenMP parallel section
-    omp_set_num_threads(NUM_THREADS);
-    #pragma omp parallel
-    {
-
-        //Retrieve the actual number of threads
-        if (omp_get_thread_num() == 0) {
-            n_threads = omp_get_num_threads();
-        }
-
-        //Lock and run the critical section
-        flock(&lock);
-        critical_section(nloops);
-        funlock(&lock);
-
-    }
-    
-    printf("Ran with %d threads\n", n_threads);
-    printf("Each thread incremented global_int %d times\n", nloops);
-    printf("Final value of global_int: %d\n", global_int);
-
-    exit(EXIT_SUCCESS);
-}
-
-.EE
 .\" .SH AUTHORS
 .\" .PP
 .\" Futexes were designed and worked on by Hubertus Franke

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [patch] futex.7: Semantics section: Race condition in locking semantics description
  2020-12-02 18:07 ` [patch] futex.7: Semantics section: Race condition in locking semantics description Sudvarg, Marion
@ 2020-12-02 20:47   ` Michael Kerrisk (man-pages)
  2020-12-02 21:04     ` Sudvarg, Marion
  0 siblings, 1 reply; 4+ messages in thread
From: Michael Kerrisk (man-pages) @ 2020-12-02 20:47 UTC (permalink / raw)
  To: Sudvarg, Marion; +Cc: mtk.manpages, linux-man, bert

Hello Marion.

On 12/2/20 7:07 PM, Sudvarg, Marion wrote:
> Hello Michael,
> 
> I apologize if you're receiving this email a second time. I
> accidentally kept HTML formatting enabled the first time I sent it,
> causing it to be rejected as spam.
> 
> I am teaching the Operating Systems Organization course at Washington
> University in St. Louis, and to supplement a series of lectures on
> locking and synchronization, I assigned students to read the futex(7)
> manual page. One student, Alex Baker <mailto:alexbaker@wustl.edu>,
> pointed out a race condition in the description of how to "down" a
> futex, i.e. wait for or acquire a lock, under the Semantics section.
> Say we have two threads, T0 and T1, which execute as follows:
> 
> 1. T0 acquires the lock, decrements the futex to 0
> 2. T1 switches in, attempts to acquire the lock, decrements the futex
> to -1
> 3. T0 switches in, completes its critical section
> 4. T0 unlocks the lock, increments the futex to 0
> 5. Because the futex is 0, T0 assumes threads are waiting on the
> futex, but no threads have yet called FUTEX_WAIT
> 6. T0 sets the futex to 1
> 7. T0 uses the FUTEX_WAKE operation
> 8. T1 switches in, and believing it should still wait for the futex,
> it sets the futex to -1
> 9. T1 now uses the FUTEX_WAIT operation
> 
> Because, in step 8, T1 has set the futex to -1, its call to
> FUTEX_WAIT in 9 will succeed, as the futex holds the expected value
> in the call (-1). But since T0 has already completed its execution
> and has called FUTEX_WAKE, T1 may never be woken.
> 
> The fwait and fpost (i.e. lock and release, or down and up) functions
> in the Examples section on the futex(2) page seem to be race free,
> but use atomic compare exchange functions, instead of the atomic
> increment and decrement semantics described in futex(7).
> 
> I've attached a patch for the futex(7) man page, which modifies the
> Semantics section to describe a correct, race-free use of a futex for
> lock acquisition using atomic increment and decrement operations.
> I've also added a code sample to help illustrate this. I hope the
> addition of the code sample does not, in your opinion, add
> unnecessary length to this manual page, given that the futex(2) page
> is already so thorough.
> 
> I have copied Bert Hubert, whom I believe to be the original author
> of the futex(7) man page.

I have a question: what Linux distro are you using/did you make
this patch from? The code example that your patch is using is _not_
in the upstream page that is part of man-pages.

Regarding your comments on the race above, I find the text of
the manual page a bit unclear, so I'm not sure what kind of
fix should be made. Maybe we are lucky and Bert has a long memory
and replies to this thread.

Thanks,

Michael
 
=====

diff --git a/man7/futex.7 b/man7/futex.7
index 22f610646..f59725b61 100644
--- a/man7/futex.7
+++ b/man7/futex.7
@@ -72,19 +72,11 @@ operation.
 Waiting on a futex, to "down" it, is the reverse operation.
 Atomically decrement the counter and check if it changed to 0,
 in which case the operation is done and the futex was uncontended.
-In all other circumstances, the process should
-request that the kernel wait for another process to up the futex.
+In all other circumstances, the process should set the counter to \-1
+and request that the kernel wait for another process to up the futex.
 This is done using the
 .B FUTEX_WAIT
-operation,
-which is provided the return value of the atomic decrement operation.
-In the event that another process has modified the value of the futex
-between the atomic decrement and the
-.B FUTEX_WAIT
-operation, this guarantees that the
-.B FUTEX_WAIT
-fails, and the process may try again to "down" the futex.
-.SS 
+operation.
 .PP
 The
 .BR futex (2)
@@ -113,166 +105,6 @@ below.
 This man page illustrates the most common use of the
 .BR futex (2)
 primitives; it is by no means the only one.
-.SH EXAMPLES
-The program below demonstrates the use of futexes in a program where
-threads use a futex to synchronize access to a critical section,
-which increments a global integer variable
-.IR nloops
-(a command-line argument that defaults to 100000 if omitted)
-times.
-After the parallel section,
-the program prints the value of the global variable.
-Upon running this program we see output such as the following:
-.PP
-.in +4n
-.EX
-$ \fB./futex_demo\fP
-Ran with 2 threads
-Each thread incremented global_int 1000000 times
-Final value of global_int: 2000000
-.EE
-.in
-.SS Program source
-\&
-.EX
-/* futex_demo.c
-
-    Usage: futex_demo [nloops]
-                    (Default: 100000)
-
-    Demonstrate the use of futexes in a program where multiple threads
-    use a futex to synchronize access to a global integer variable, which
-    is initialized to 0. The two threads each increment the variable
-    \(aqnloops\(aq times, and employ a synchronization protocol that
-    ensures only one thread can access the global variable at a time.
-
-    We use OpenMP for thread parallelism;
-    therefore, this program must be compiled with the \-fopenmp flag,
-    e.g.:
-
-    gcc futex_demo.c \-o futex_demo \-fopenmp
-
-#define _GNU_SOURCE
-#include <stdio.h>
-#include <errno.h>
-#include <stdatomic.h>
-#include <stdint.h>
-#include <stdlib.h>
-#include <unistd.h>
-#include <sys/syscall.h>
-#include <linux/futex.h>
-#include <omp.h>
-
-#define errExit(msg)    do { perror(msg); exit(EXIT_FAILURE); \e
-                        } while (0)
-
-#define NUM_THREADS 2
-#define LOCKED 0
-#define UNLOCKED 1
-
-static int global_int = 0;
-static uint32_t lock = UNLOCKED;
-
-static int
-futex(uint32_t *uaddr, int futex_op, uint32_t val,
-      const struct timespec *timeout, uint32_t *uaddr2, uint32_t val3)
-{
-    return syscall(SYS_futex, uaddr, futex_op, val,
-                   timeout, uaddr2, val3);
-}
-
-/*  Increments the global integer variable nloops times.
-    Without locking, a race condition may occur. */
-
-static void
-critical_section(int nloops)
-{
-    for (int i = 0; i < nloops; i++) {
-        global_int++;
-    }
-}
-
-
-/*  Attempt to lock the futex pointed to by \(aqfutexp\(aq:
-    The futex value is decremented by 1.
-    If the futex value is now LOCKED,
-    the lock was successfully acquired.
-    Otherwise, wait for the lock to be released. */
-
-static void
-flock(uint32_t * futexp)
-{
-    int s;
-    int futex_val;
-
-    /* Attempt to acquire the lock */
-    while ( (futex_val = __atomic_sub_fetch(futexp, 1, __ATOMIC_ACQ_REL)) < LOCKED ) {
-
-        /* If the lock is not available, wait */
-
-        s = futex(futexp, FUTEX_WAIT, futex_val, NULL, NULL, 0);
-        if (s == \-1 && errno != EAGAIN)
-            errExit("futex\-FUTEX_WAIT");
-    }
-}
-
-/*  Unlock the futex pointed to by \(aqfutexp\(aq:
-    The futex value is incremented by 1.
-    
-    If the futex value is now UNLOCKED,
-    no threads are waiting for the lock.
-    Otherwise, another thread is waiting,
-    so set the value to UNLOCKED and wake. */
-
-static void
-funlock(uint32_t * futexp)
-{
-    int s;
-
-    /* Are any threads waiting for the lock? */
-    if (__atomic_add_fetch(futexp, 1, __ATOMIC_ACQ_REL) != UNLOCKED) {
-
-        /* If so, unlock and notify */
-        __atomic_store_n(futexp, UNLOCKED, __ATOMIC_RELEASE);
-        s = futex(futexp, FUTEX_WAKE, 1, NULL, NULL, 0);
-        if (s  == \-1)
-            errExit("futex\-FUTEX_WAKE");
-    }
-}
-
-int
-main(int argc, char *argv[])
-{
-    int nloops;
-    int n_threads;
-
-    nloops = (argc > 1) ? atoi(argv[1]) : 100000;
-
-    //Begin OpenMP parallel section
-    omp_set_num_threads(NUM_THREADS);
-    #pragma omp parallel
-    {
-
-        //Retrieve the actual number of threads
-        if (omp_get_thread_num() == 0) {
-            n_threads = omp_get_num_threads();
-        }
-
-        //Lock and run the critical section
-        flock(&lock);
-        critical_section(nloops);
-        funlock(&lock);
-
-    }
-    
-    printf("Ran with %d threads\n", n_threads);
-    printf("Each thread incremented global_int %d times\n", nloops);
-    printf("Final value of global_int: %d\n", global_int);
-
-    exit(EXIT_SUCCESS);
-}
-
-.EE
 .\" .SH AUTHORS
 .\" .PP
 .\" Futexes were designed and worked on by Hubertus Franke
-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* RE: [patch] futex.7: Semantics section: Race condition in locking semantics description
  2020-12-02 20:47   ` Michael Kerrisk (man-pages)
@ 2020-12-02 21:04     ` Sudvarg, Marion
  2021-02-21 23:02       ` FW: " Sudvarg, Marion
  0 siblings, 1 reply; 4+ messages in thread
From: Sudvarg, Marion @ 2020-12-02 21:04 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages); +Cc: linux-man, bert, Gill, Christopher

[-- Attachment #1: Type: text/plain, Size: 9803 bytes --]

Hello Michael,

Thank you for the fast response!

My mistake; I managed to send you a patch going in the opposite direction: it would reset my changes back to the upstream master. I have attached the correct patch file, which would incorporate the changes I have made into the master branch with origin:

https://git.kernel.org/pub/scm/docs/man-pages/man-pages.git

This is the repository I've been working from; I assume that's the correct, most up-to-date version? 

The code example is my own, and (per the attached, corrected patch) my intention was to add the code example to the manual page to better illustrate the "up" and "down" locking semantics that are described.

Again, my apologies for the mistake. Best,

Marion Sudvarg


-----Original Message-----
From: Michael Kerrisk (man-pages) <mtk.manpages@gmail.com> 
Sent: Wednesday, December 2, 2020 2:47 PM
To: Sudvarg, Marion <msudvarg@wustl.edu>
Cc: mtk.manpages@gmail.com; linux-man@vger.kernel.org; bert@hubertnet.nl
Subject: Re: [patch] futex.7: Semantics section: Race condition in locking semantics description

Hello Marion.

On 12/2/20 7:07 PM, Sudvarg, Marion wrote:
> Hello Michael,
> 
> I apologize if you're receiving this email a second time. I
> accidentally kept HTML formatting enabled the first time I sent it,
> causing it to be rejected as spam.
> 
> I am teaching the Operating Systems Organization course at Washington
> University in St. Louis, and to supplement a series of lectures on
> locking and synchronization, I assigned students to read the futex(7)
> manual page. One student, Alex Baker <mailto:alexbaker@wustl.edu>,
> pointed out a race condition in the description of how to "down" a
> futex, i.e. wait for or acquire a lock, under the Semantics section.
> Say we have two threads, T0 and T1, which execute as follows:
> 
> 1. T0 acquires the lock, decrements the futex to 0
> 2. T1 switches in, attempts to acquire the lock, decrements the futex
> to -1
> 3. T0 switches in, completes its critical section
> 4. T0 unlocks the lock, increments the futex to 0
> 5. Because the futex is 0, T0 assumes threads are waiting on the
> futex, but no threads have yet called FUTEX_WAIT
> 6. T0 sets the futex to 1
> 7. T0 uses the FUTEX_WAKE operation
> 8. T1 switches in, and believing it should still wait for the futex,
> it sets the futex to -1
> 9. T1 now uses the FUTEX_WAIT operation
> 
> Because, in step 8, T1 has set the futex to -1, its call to
> FUTEX_WAIT in 9 will succeed, as the futex holds the expected value
> in the call (-1). But since T0 has already completed its execution
> and has called FUTEX_WAKE, T1 may never be woken.
> 
> The fwait and fpost (i.e. lock and release, or down and up) functions
> in the Examples section on the futex(2) page seem to be race free,
> but use atomic compare exchange functions, instead of the atomic
> increment and decrement semantics described in futex(7).
> 
> I've attached a patch for the futex(7) man page, which modifies the
> Semantics section to describe a correct, race-free use of a futex for
> lock acquisition using atomic increment and decrement operations.
> I've also added a code sample to help illustrate this. I hope the
> addition of the code sample does not, in your opinion, add
> unnecessary length to this manual page, given that the futex(2) page
> is already so thorough.
> 
> I have copied Bert Hubert, whom I believe to be the original author
> of the futex(7) man page.

I have a question: what Linux distro are you using/did you make
this patch from? The code example that your patch is using is _not_
in the upstream page that is part of man-pages.

Regarding your comments on the race above, I find the text of
the manual page a bit unclear, so I'm not sure what kind of
fix should be made. Maybe we are lucky and Bert has a long memory
and replies to this thread.

Thanks,

Michael
 
=====

diff --git a/man7/futex.7 b/man7/futex.7
index 22f610646..f59725b61 100644
--- a/man7/futex.7
+++ b/man7/futex.7
@@ -72,19 +72,11 @@ operation.
 Waiting on a futex, to "down" it, is the reverse operation.
 Atomically decrement the counter and check if it changed to 0,
 in which case the operation is done and the futex was uncontended.
-In all other circumstances, the process should
-request that the kernel wait for another process to up the futex.
+In all other circumstances, the process should set the counter to \-1
+and request that the kernel wait for another process to up the futex.
 This is done using the
 .B FUTEX_WAIT
-operation,
-which is provided the return value of the atomic decrement operation.
-In the event that another process has modified the value of the futex
-between the atomic decrement and the
-.B FUTEX_WAIT
-operation, this guarantees that the
-.B FUTEX_WAIT
-fails, and the process may try again to "down" the futex.
-.SS 
+operation.
 .PP
 The
 .BR futex (2)
@@ -113,166 +105,6 @@ below.
 This man page illustrates the most common use of the
 .BR futex (2)
 primitives; it is by no means the only one.
-.SH EXAMPLES
-The program below demonstrates the use of futexes in a program where
-threads use a futex to synchronize access to a critical section,
-which increments a global integer variable
-.IR nloops
-(a command-line argument that defaults to 100000 if omitted)
-times.
-After the parallel section,
-the program prints the value of the global variable.
-Upon running this program we see output such as the following:
-.PP
-.in +4n
-.EX
-$ \fB./futex_demo\fP
-Ran with 2 threads
-Each thread incremented global_int 1000000 times
-Final value of global_int: 2000000
-.EE
-.in
-.SS Program source
-\&
-.EX
-/* futex_demo.c
-
-    Usage: futex_demo [nloops]
-                    (Default: 100000)
-
-    Demonstrate the use of futexes in a program where multiple threads
-    use a futex to synchronize access to a global integer variable, which
-    is initialized to 0. The two threads each increment the variable
-    \(aqnloops\(aq times, and employ a synchronization protocol that
-    ensures only one thread can access the global variable at a time.
-
-    We use OpenMP for thread parallelism;
-    therefore, this program must be compiled with the \-fopenmp flag,
-    e.g.:
-
-    gcc futex_demo.c \-o futex_demo \-fopenmp
-
-#define _GNU_SOURCE
-#include <stdio.h>
-#include <errno.h>
-#include <stdatomic.h>
-#include <stdint.h>
-#include <stdlib.h>
-#include <unistd.h>
-#include <sys/syscall.h>
-#include <linux/futex.h>
-#include <omp.h>
-
-#define errExit(msg)    do { perror(msg); exit(EXIT_FAILURE); \e
-                        } while (0)
-
-#define NUM_THREADS 2
-#define LOCKED 0
-#define UNLOCKED 1
-
-static int global_int = 0;
-static uint32_t lock = UNLOCKED;
-
-static int
-futex(uint32_t *uaddr, int futex_op, uint32_t val,
-      const struct timespec *timeout, uint32_t *uaddr2, uint32_t val3)
-{
-    return syscall(SYS_futex, uaddr, futex_op, val,
-                   timeout, uaddr2, val3);
-}
-
-/*  Increments the global integer variable nloops times.
-    Without locking, a race condition may occur. */
-
-static void
-critical_section(int nloops)
-{
-    for (int i = 0; i < nloops; i++) {
-        global_int++;
-    }
-}
-
-
-/*  Attempt to lock the futex pointed to by \(aqfutexp\(aq:
-    The futex value is decremented by 1.
-    If the futex value is now LOCKED,
-    the lock was successfully acquired.
-    Otherwise, wait for the lock to be released. */
-
-static void
-flock(uint32_t * futexp)
-{
-    int s;
-    int futex_val;
-
-    /* Attempt to acquire the lock */
-    while ( (futex_val = __atomic_sub_fetch(futexp, 1, __ATOMIC_ACQ_REL)) < LOCKED ) {
-
-        /* If the lock is not available, wait */
-
-        s = futex(futexp, FUTEX_WAIT, futex_val, NULL, NULL, 0);
-        if (s == \-1 && errno != EAGAIN)
-            errExit("futex\-FUTEX_WAIT");
-    }
-}
-
-/*  Unlock the futex pointed to by \(aqfutexp\(aq:
-    The futex value is incremented by 1.
-    
-    If the futex value is now UNLOCKED,
-    no threads are waiting for the lock.
-    Otherwise, another thread is waiting,
-    so set the value to UNLOCKED and wake. */
-
-static void
-funlock(uint32_t * futexp)
-{
-    int s;
-
-    /* Are any threads waiting for the lock? */
-    if (__atomic_add_fetch(futexp, 1, __ATOMIC_ACQ_REL) != UNLOCKED) {
-
-        /* If so, unlock and notify */
-        __atomic_store_n(futexp, UNLOCKED, __ATOMIC_RELEASE);
-        s = futex(futexp, FUTEX_WAKE, 1, NULL, NULL, 0);
-        if (s  == \-1)
-            errExit("futex\-FUTEX_WAKE");
-    }
-}
-
-int
-main(int argc, char *argv[])
-{
-    int nloops;
-    int n_threads;
-
-    nloops = (argc > 1) ? atoi(argv[1]) : 100000;
-
-    //Begin OpenMP parallel section
-    omp_set_num_threads(NUM_THREADS);
-    #pragma omp parallel
-    {
-
-        //Retrieve the actual number of threads
-        if (omp_get_thread_num() == 0) {
-            n_threads = omp_get_num_threads();
-        }
-
-        //Lock and run the critical section
-        flock(&lock);
-        critical_section(nloops);
-        funlock(&lock);
-
-    }
-    
-    printf("Ran with %d threads\n", n_threads);
-    printf("Each thread incremented global_int %d times\n", nloops);
-    printf("Final value of global_int: %d\n", global_int);
-
-    exit(EXIT_SUCCESS);
-}
-
-.EE
 .\" .SH AUTHORS
 .\" .PP
 .\" Futexes were designed and worked on by Hubertus Franke
-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

[-- Attachment #2: futex.7.patch --]
[-- Type: application/octet-stream, Size: 5495 bytes --]

diff --git a/man7/futex.7 b/man7/futex.7
index f59725b61..22f610646 100644
--- a/man7/futex.7
+++ b/man7/futex.7
@@ -72,11 +72,19 @@ operation.
 Waiting on a futex, to "down" it, is the reverse operation.
 Atomically decrement the counter and check if it changed to 0,
 in which case the operation is done and the futex was uncontended.
-In all other circumstances, the process should set the counter to \-1
-and request that the kernel wait for another process to up the futex.
+In all other circumstances, the process should
+request that the kernel wait for another process to up the futex.
 This is done using the
 .B FUTEX_WAIT
-operation.
+operation,
+which is provided the return value of the atomic decrement operation.
+In the event that another process has modified the value of the futex
+between the atomic decrement and the
+.B FUTEX_WAIT
+operation, this guarantees that the
+.B FUTEX_WAIT
+fails, and the process may try again to "down" the futex.
+.SS 
 .PP
 The
 .BR futex (2)
@@ -105,6 +113,166 @@ below.
 This man page illustrates the most common use of the
 .BR futex (2)
 primitives; it is by no means the only one.
+.SH EXAMPLES
+The program below demonstrates the use of futexes in a program where
+threads use a futex to synchronize access to a critical section,
+which increments a global integer variable
+.IR nloops
+(a command-line argument that defaults to 100000 if omitted)
+times.
+After the parallel section,
+the program prints the value of the global variable.
+Upon running this program we see output such as the following:
+.PP
+.in +4n
+.EX
+$ \fB./futex_demo\fP
+Ran with 2 threads
+Each thread incremented global_int 1000000 times
+Final value of global_int: 2000000
+.EE
+.in
+.SS Program source
+\&
+.EX
+/* futex_demo.c
+
+    Usage: futex_demo [nloops]
+                    (Default: 100000)
+
+    Demonstrate the use of futexes in a program where multiple threads
+    use a futex to synchronize access to a global integer variable, which
+    is initialized to 0. The two threads each increment the variable
+    \(aqnloops\(aq times, and employ a synchronization protocol that
+    ensures only one thread can access the global variable at a time.
+
+    We use OpenMP for thread parallelism;
+    therefore, this program must be compiled with the \-fopenmp flag,
+    e.g.:
+
+    gcc futex_demo.c \-o futex_demo \-fopenmp
+
+#define _GNU_SOURCE
+#include <stdio.h>
+#include <errno.h>
+#include <stdatomic.h>
+#include <stdint.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <sys/syscall.h>
+#include <linux/futex.h>
+#include <omp.h>
+
+#define errExit(msg)    do { perror(msg); exit(EXIT_FAILURE); \e
+                        } while (0)
+
+#define NUM_THREADS 2
+#define LOCKED 0
+#define UNLOCKED 1
+
+static int global_int = 0;
+static uint32_t lock = UNLOCKED;
+
+static int
+futex(uint32_t *uaddr, int futex_op, uint32_t val,
+      const struct timespec *timeout, uint32_t *uaddr2, uint32_t val3)
+{
+    return syscall(SYS_futex, uaddr, futex_op, val,
+                   timeout, uaddr2, val3);
+}
+
+/*  Increments the global integer variable nloops times.
+    Without locking, a race condition may occur. */
+
+static void
+critical_section(int nloops)
+{
+    for (int i = 0; i < nloops; i++) {
+        global_int++;
+    }
+}
+
+
+/*  Attempt to lock the futex pointed to by \(aqfutexp\(aq:
+    The futex value is decremented by 1.
+    If the futex value is now LOCKED,
+    the lock was successfully acquired.
+    Otherwise, wait for the lock to be released. */
+
+static void
+flock(uint32_t * futexp)
+{
+    int s;
+    int futex_val;
+
+    /* Attempt to acquire the lock */
+    while ( (futex_val = __atomic_sub_fetch(futexp, 1, __ATOMIC_ACQ_REL)) < LOCKED ) {
+
+        /* If the lock is not available, wait */
+
+        s = futex(futexp, FUTEX_WAIT, futex_val, NULL, NULL, 0);
+        if (s == \-1 && errno != EAGAIN)
+            errExit("futex\-FUTEX_WAIT");
+    }
+}
+
+/*  Unlock the futex pointed to by \(aqfutexp\(aq:
+    The futex value is incremented by 1.
+    
+    If the futex value is now UNLOCKED,
+    no threads are waiting for the lock.
+    Otherwise, another thread is waiting,
+    so set the value to UNLOCKED and wake. */
+
+static void
+funlock(uint32_t * futexp)
+{
+    int s;
+
+    /* Are any threads waiting for the lock? */
+    if (__atomic_add_fetch(futexp, 1, __ATOMIC_ACQ_REL) != UNLOCKED) {
+
+        /* If so, unlock and notify */
+        __atomic_store_n(futexp, UNLOCKED, __ATOMIC_RELEASE);
+        s = futex(futexp, FUTEX_WAKE, 1, NULL, NULL, 0);
+        if (s  == \-1)
+            errExit("futex\-FUTEX_WAKE");
+    }
+}
+
+int
+main(int argc, char *argv[])
+{
+    int nloops;
+    int n_threads;
+
+    nloops = (argc > 1) ? atoi(argv[1]) : 100000;
+
+    //Begin OpenMP parallel section
+    omp_set_num_threads(NUM_THREADS);
+    #pragma omp parallel
+    {
+
+        //Retrieve the actual number of threads
+        if (omp_get_thread_num() == 0) {
+            n_threads = omp_get_num_threads();
+        }
+
+        //Lock and run the critical section
+        flock(&lock);
+        critical_section(nloops);
+        funlock(&lock);
+
+    }
+    
+    printf("Ran with %d threads\n", n_threads);
+    printf("Each thread incremented global_int %d times\n", nloops);
+    printf("Final value of global_int: %d\n", global_int);
+
+    exit(EXIT_SUCCESS);
+}
+
+.EE
 .\" .SH AUTHORS
 .\" .PP
 .\" Futexes were designed and worked on by Hubertus Franke

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* FW: [patch] futex.7: Semantics section: Race condition in locking semantics description
  2020-12-02 21:04     ` Sudvarg, Marion
@ 2021-02-21 23:02       ` Sudvarg, Marion
  0 siblings, 0 replies; 4+ messages in thread
From: Sudvarg, Marion @ 2021-02-21 23:02 UTC (permalink / raw)
  To: mtk.manpages; +Cc: linux-man, bert, Gill, Christopher

[-- Attachment #1: Type: text/plain, Size: 2418 bytes --]

Hello Michael,

I hope you've been well. I'm writing to follow up with you regarding our proposed changes to the futex.7 man-pages entry. Last semester, I taught the Operating Systems Organization course at Washington University in St. Louis, and to supplement a series of lectures on locking and synchronization, I assigned students to read the futex(7) manual page. One student, Alex Baker <alexbaker@wustl.edu>, pointed out a race condition in the description of how to "down" a futex, i.e. wait for or acquire a lock, under the Semantics section. Say we have two threads, T0 and T1, which execute as follows:

1. T0 acquires the lock, decrements the futex to 0
2. T1 switches in, attempts to acquire the lock, decrements the futex to -1
3. T0 switches in, completes its critical section
4. T0 unlocks the lock, increments the futex to 0
5. Because the futex is 0, T0 assumes threads are waiting on the futex, but no threads have yet called FUTEX_WAIT
6. T0 sets the futex to 1
7. T0 uses the FUTEX_WAKE operation
8. T1 switches in, and believing it should still wait for the futex, it sets the futex to -1
9. T1 now uses the FUTEX_WAIT operation

Because, in step 8, T1 has set the futex to -1, its call to FUTEX_WAIT in 9 will succeed, as the futex holds the expected value in the call (-1). But since T0 has already completed its execution and has called FUTEX_WAKE, T1 may never be woken.

The fwait and fpost (i.e. lock and release, or down and up) functions in the Examples section on the futex(2) page seem to be race free, but use atomic compare exchange functions, instead of the atomic increment and decrement semantics described in futex(7). 

I've attached a patch for the futex(7) man page, which modifies the Semantics section to describe a correct, race-free use of a futex for lock acquisition using atomic increment and decrement operations. I've also added a code sample to help illustrate this. I hope the addition of the code sample does not, in your opinion, add unnecessary length to this manual page, given that the futex(2) page is already so thorough.

I have again copied Bert Hubert, whom I believe to be the original author of the futex(7) man page. I have also copied Chris Gill, who is my adviser and oversees the operating system courses at Washington University.

Thank you,

Marion Sudvarg
PhD Student, Computer Science
Washington University in St. Louis

[-- Attachment #2: futex.7.patch --]
[-- Type: application/octet-stream, Size: 5495 bytes --]

diff --git a/man7/futex.7 b/man7/futex.7
index f59725b61..22f610646 100644
--- a/man7/futex.7
+++ b/man7/futex.7
@@ -72,11 +72,19 @@ operation.
 Waiting on a futex, to "down" it, is the reverse operation.
 Atomically decrement the counter and check if it changed to 0,
 in which case the operation is done and the futex was uncontended.
-In all other circumstances, the process should set the counter to \-1
-and request that the kernel wait for another process to up the futex.
+In all other circumstances, the process should
+request that the kernel wait for another process to up the futex.
 This is done using the
 .B FUTEX_WAIT
-operation.
+operation,
+which is provided the return value of the atomic decrement operation.
+In the event that another process has modified the value of the futex
+between the atomic decrement and the
+.B FUTEX_WAIT
+operation, this guarantees that the
+.B FUTEX_WAIT
+fails, and the process may try again to "down" the futex.
+.SS 
 .PP
 The
 .BR futex (2)
@@ -105,6 +113,166 @@ below.
 This man page illustrates the most common use of the
 .BR futex (2)
 primitives; it is by no means the only one.
+.SH EXAMPLES
+The program below demonstrates the use of futexes in a program where
+threads use a futex to synchronize access to a critical section,
+which increments a global integer variable
+.IR nloops
+(a command-line argument that defaults to 100000 if omitted)
+times.
+After the parallel section,
+the program prints the value of the global variable.
+Upon running this program we see output such as the following:
+.PP
+.in +4n
+.EX
+$ \fB./futex_demo\fP
+Ran with 2 threads
+Each thread incremented global_int 1000000 times
+Final value of global_int: 2000000
+.EE
+.in
+.SS Program source
+\&
+.EX
+/* futex_demo.c
+
+    Usage: futex_demo [nloops]
+                    (Default: 100000)
+
+    Demonstrate the use of futexes in a program where multiple threads
+    use a futex to synchronize access to a global integer variable, which
+    is initialized to 0. The two threads each increment the variable
+    \(aqnloops\(aq times, and employ a synchronization protocol that
+    ensures only one thread can access the global variable at a time.
+
+    We use OpenMP for thread parallelism;
+    therefore, this program must be compiled with the \-fopenmp flag,
+    e.g.:
+
+    gcc futex_demo.c \-o futex_demo \-fopenmp
+
+#define _GNU_SOURCE
+#include <stdio.h>
+#include <errno.h>
+#include <stdatomic.h>
+#include <stdint.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <sys/syscall.h>
+#include <linux/futex.h>
+#include <omp.h>
+
+#define errExit(msg)    do { perror(msg); exit(EXIT_FAILURE); \e
+                        } while (0)
+
+#define NUM_THREADS 2
+#define LOCKED 0
+#define UNLOCKED 1
+
+static int global_int = 0;
+static uint32_t lock = UNLOCKED;
+
+static int
+futex(uint32_t *uaddr, int futex_op, uint32_t val,
+      const struct timespec *timeout, uint32_t *uaddr2, uint32_t val3)
+{
+    return syscall(SYS_futex, uaddr, futex_op, val,
+                   timeout, uaddr2, val3);
+}
+
+/*  Increments the global integer variable nloops times.
+    Without locking, a race condition may occur. */
+
+static void
+critical_section(int nloops)
+{
+    for (int i = 0; i < nloops; i++) {
+        global_int++;
+    }
+}
+
+
+/*  Attempt to lock the futex pointed to by \(aqfutexp\(aq:
+    The futex value is decremented by 1.
+    If the futex value is now LOCKED,
+    the lock was successfully acquired.
+    Otherwise, wait for the lock to be released. */
+
+static void
+flock(uint32_t * futexp)
+{
+    int s;
+    int futex_val;
+
+    /* Attempt to acquire the lock */
+    while ( (futex_val = __atomic_sub_fetch(futexp, 1, __ATOMIC_ACQ_REL)) < LOCKED ) {
+
+        /* If the lock is not available, wait */
+
+        s = futex(futexp, FUTEX_WAIT, futex_val, NULL, NULL, 0);
+        if (s == \-1 && errno != EAGAIN)
+            errExit("futex\-FUTEX_WAIT");
+    }
+}
+
+/*  Unlock the futex pointed to by \(aqfutexp\(aq:
+    The futex value is incremented by 1.
+    
+    If the futex value is now UNLOCKED,
+    no threads are waiting for the lock.
+    Otherwise, another thread is waiting,
+    so set the value to UNLOCKED and wake. */
+
+static void
+funlock(uint32_t * futexp)
+{
+    int s;
+
+    /* Are any threads waiting for the lock? */
+    if (__atomic_add_fetch(futexp, 1, __ATOMIC_ACQ_REL) != UNLOCKED) {
+
+        /* If so, unlock and notify */
+        __atomic_store_n(futexp, UNLOCKED, __ATOMIC_RELEASE);
+        s = futex(futexp, FUTEX_WAKE, 1, NULL, NULL, 0);
+        if (s  == \-1)
+            errExit("futex\-FUTEX_WAKE");
+    }
+}
+
+int
+main(int argc, char *argv[])
+{
+    int nloops;
+    int n_threads;
+
+    nloops = (argc > 1) ? atoi(argv[1]) : 100000;
+
+    //Begin OpenMP parallel section
+    omp_set_num_threads(NUM_THREADS);
+    #pragma omp parallel
+    {
+
+        //Retrieve the actual number of threads
+        if (omp_get_thread_num() == 0) {
+            n_threads = omp_get_num_threads();
+        }
+
+        //Lock and run the critical section
+        flock(&lock);
+        critical_section(nloops);
+        funlock(&lock);
+
+    }
+    
+    printf("Ran with %d threads\n", n_threads);
+    printf("Each thread incremented global_int %d times\n", nloops);
+    printf("Final value of global_int: %d\n", global_int);
+
+    exit(EXIT_SUCCESS);
+}
+
+.EE
 .\" .SH AUTHORS
 .\" .PP
 .\" Futexes were designed and worked on by Hubertus Franke

^ permalink raw reply related	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2021-02-21 23:03 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <DM5PR02MB3687609B599F7773193DE31AC4F30@DM5PR02MB3687.namprd02.prod.outlook.com>
2020-12-02 18:07 ` [patch] futex.7: Semantics section: Race condition in locking semantics description Sudvarg, Marion
2020-12-02 20:47   ` Michael Kerrisk (man-pages)
2020-12-02 21:04     ` Sudvarg, Marion
2021-02-21 23:02       ` FW: " Sudvarg, Marion

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.