linux-kernel.vger.kernel.org archive mirror
* [BUG] 4.4.262: infinite loop in futex_unlock_pi (EAGAIN loop)
@ 2021-07-19 16:24 Joe Korty
  2021-07-22 14:11 ` Greg Kroah-Hartman
  0 siblings, 1 reply; 6+ messages in thread
From: Joe Korty @ 2021-07-19 16:24 UTC (permalink / raw)
  To: Peter Zijlstra, Lee Jones
  Cc: Steven Rostedt, Thomas Gleixner, Sebastian Andrzej Siewior,
	Greg Kroah-Hartman, LKML

[-- Attachment #1: Type: text/plain, Size: 2125 bytes --]

[BUG] 4.4.262: infinite loop in futex_unlock_pi (EAGAIN loop)

   [ replicator, attached ]
   [ workaround patch that crudely clears the loop, attached ]
   [ 4.4.256 does _not_ have this problem, 4.4.262 is known to have it ]

When a certain, secure-site application is run on 4.4.262, it locks up and
is unkillable.  Crash(8) and sysrq backtraces show that the application
is looping in the kernel in futex_unlock_pi.

Between 4.4.256 and .257, 4.4 got this 4.12 patch backported into it:

   73d786b ("[PATCH] futex: Rework inconsistent rt_mutex/futex_q state")

This patch has the following comment:

   The only problem is that this breaks RT timeliness guarantees. That
   is, consider the following scenario:

      T1 and T2 are both pinned to CPU0. prio(T2) > prio(T1)

        CPU0

        T1
          lock_pi()
          queue_me()  <- Waiter is visible
   
        preemption

        T2
          unlock_pi()
            loops with -EAGAIN forever

    Which is undesirable for PI primitives. Future patches will rectify
    this.

This describes the situation exactly.  To prove this, we developed a little
kernel patch that, on loop detection, puts a message into the kernel log for
just the first occurrence, keeps a count of the number of occurrences seen
since boot, and tries to break out of the loop via usleep_range(1000,1000).
Note that the patch is not really needed for replication.  It merely shows,
by 'fixing' the problem, that it really is the EAGAIN loop that triggers
the lockup.

Along with this patch, we submit a replicator.  Running this replicator
with this patch, it can be seen that 4.4.256 does not have the problem.
4.4.267 and the latest 4.4, 4.4.275, do.  In addition, 4.9.274 (tested
w/o the patch) does not have the problem.

From this pattern there may be some futex fixup patch that was ported
back into 4.9 but failed to make it to 4.4.
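
The sort of comparison we have in mind is sketched below.  It is
illustrative only: the tag names and version ranges are examples, not
something we have run exhaustively.

    # futex changes that went into each stable branch (assumes a
    # linux-stable clone with the 4.4.y and 4.9.y tags available)
    git log --oneline v4.9..v4.9.274 -- kernel/futex.c > futex-4.9.txt
    git log --oneline v4.4..v4.4.275 -- kernel/futex.c > futex-4.4.txt

    # subjects in the 4.9 list but missing from the 4.4 list are
    # candidates for a backport that never reached 4.4
    diff <(cut -d' ' -f2- futex-4.9.txt | sort) \
         <(cut -d' ' -f2- futex-4.4.txt | sort)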

Acknowledgements: My colleague, Scott Shaffer, performed the crash/sysrq
analysis that found the futex_unlock_pi loop, and he raised the suspicion
that commit 73d786b might be the cause.

Signed-off-by: Joe Korty <joe.korty@concurrent-rt.com>


[-- Attachment #2: futex-unlock-pi-eagain-hack --]
[-- Type: text/plain, Size: 3153 bytes --]

Fix infinite 'EAGAIN' loop in futex_unlock_pi (Raytheon bug).

  [ 4.4.256-RedHawk is known _not_ to have this problem ]
  [ 4.4.262-RedHawk is known to have it ]
  [ This is unlikely to be caused by the rt patch ]
  [ or by our patches ]

A customer's rather large, secure-site application
occasionally stalls indefinitely, spinning in
futex_unlock_pi.  The stall cannot be cleared except
by rebooting.

New to 4.4 is 73d786b ("[PATCH] futex: Rework inconsistent
rt_mutex/futex_q state").  This adds an EAGAIN loop which
appears to be the cause of the problem.  That commit has
this comment:

   The only problem is that this breaks RT timeliness guarantees. That
   is, consider the following scenario:

      T1 and T2 are both pinned to CPU0. prio(T2) > prio(T1)

        CPU0

        T1
          lock_pi()
          queue_me()  <- Waiter is visible
    
        preemption

        T2
          unlock_pi()
            loops with -EAGAIN forever

    Which is undesirable for PI primitives. Future patches will rectify
    this.

The above matches the behavior exactly.  Our replicator,
based on the posix-thread/pthread_mutex_lock/1-1.c test,
shows that this is indeed the problem.

For now, we work around this by introducing a
one-millisecond sleep in the looping code.  This gives
the lower-priority task T1 the time it needs to clean
up the condition that the higher-priority task T2 is
spin-waiting on.

This patch is a stopgap until the proper fix is
backported from mainline.  We suspect that some futex
patch from 4.12 that actually fixes this was never
backported to 4.4.

Problem-Analysed-by: Scott Shaffer while at concurrent-rt.com
Developed-by: Joe Korty while at concurrent-rt.com

Index: b/kernel/futex.c
===================================================================
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -66,6 +66,7 @@
 #include <linux/freezer.h>
 #include <linux/bootmem.h>
 #include <linux/fault-inject.h>
+#include <linux/delay.h>
 
 #include <asm/futex.h>
 
@@ -1529,8 +1530,10 @@ static void mark_wake_futex(struct wake_
 	q->lock_ptr = NULL;
 }
 
+unsigned long futex_eagain_ctr = 0;
+
 static int wake_futex_pi(u32 __user *uaddr, u32 uval, struct futex_q *this,
-			 struct futex_hash_bucket *hb)
+			 struct futex_hash_bucket *hb, int *dologp)
 {
 	struct task_struct *new_owner;
 	struct futex_pi_state *pi_state = this->pi_state;
@@ -1564,6 +1567,7 @@ static int wake_futex_pi(u32 __user *uad
 	 */
 	if (!new_owner) {
 		raw_spin_unlock_irq(&pi_state->pi_mutex.wait_lock);
+		*dologp = 1;
 		return -EAGAIN;
 	}
 
@@ -2967,7 +2971,8 @@ retry:
 	 */
 	match = futex_top_waiter(hb, &key);
 	if (match) {
-		ret = wake_futex_pi(uaddr, uval, match, hb);
+		int log_eagain = 0;
+		ret = wake_futex_pi(uaddr, uval, match, hb, &log_eagain);
 		/*
 		 * In case of success wake_futex_pi dropped the hash
 		 * bucket lock.
@@ -2987,6 +2992,11 @@ retry:
 		if (ret == -EAGAIN) {
 			spin_unlock(&hb->lock);
 			put_futex_key(&key);
+			if (log_eagain) {
+				printk_once("WARNING: futex_unlock_pi: in EAGAIN loop\n");
+				futex_eagain_ctr++;
+				usleep_range(1000, 1000); /* let the guy run who will clean things up */
+			}
 			goto retry;
 		}
 		/*

[-- Attachment #3: 1-1.c --]
[-- Type: text/plain, Size: 3997 bytes --]

/*   
 * LTP project test pthread_mutex_lock/1-1.c, modified to trigger
 * the futex_unlock_pi EAGAIN loop stall on kernels having
 * this bug.
 *
 * To build: cc -o 1-1 1-1.c -pthread
 * To run: taskset 4 ./1-1
 * Result: if the helper kernel patch is installed, one will get this
 * in the kernel log: 'WARNING: futex_unlock_pi: in EAGAIN loop'.
 * This indicates that the EAGAIN loop was seen and broken out of with
 * a usleep_range(1000,1000).  If the helper kernel patch is not present,
 * 1-1 loops forever in futex_unlock_pi and is unkillable.
 *
 *
 *
 * Copyright (c) 2002, Intel Corporation. All rights reserved.
 * Created by:  bing.wei.liu REMOVE-THIS AT intel DOT com
 * This file is licensed under the GPL license.  For the full content
 * of this license, see the COPYING file at the top level of this 
 * source tree.

 * Test that pthread_mutex_lock()
 *   shall lock the mutex object referenced by 'mutex'.  If the mutex is
 *   already locked, the calling thread shall block until the mutex becomes
 *   available. This operation shall return with the mutex object referenced
 *   by 'mutex' in the locked state with the calling thread as its owner.

 * Steps: 
 *   -- Initialize a mutex to protect a global variable 'value'
 *   -- Create N threads.  Each loops M times: acquire the mutex,
 *      increment the value, and then release the mutex.
 *   -- Check that the value has increased to exactly M*N; a broken
 *      mutex implementation may cause lost increments.
 *
 */

#define _XOPEN_SOURCE 600

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>
#include <sched.h>
#include "posixtest.h"

#define    THREAD_NUM  	15
#define    LOOPS     	40000

/* Switch the calling thread to SCHED_FIFO at the given priority.
 * Note: this needs sufficient privilege (root, CAP_SYS_NICE, or an
 * RLIMIT_RTPRIO allowance); the return value is not checked here.
 */
static void setpri(int priority)
{
	struct sched_param parm;
	parm.sched_priority = priority;
	sched_setscheduler(0, SCHED_FIFO, &parm);
}

void *f1(void *parm);

pthread_mutex_t    mutex = PTHREAD_MUTEX_INITIALIZER;
int                value;	/* value protected by mutex */

int main(void)
{
  	int                   i, rc __attribute__((unused));
  	pthread_attr_t        pta;
	pthread_mutexattr_t   mta;
  	pthread_t             threads[THREAD_NUM];

  	pthread_attr_init(&pta);
  	pthread_attr_setdetachstate(&pta, PTHREAD_CREATE_JOINABLE);

	pthread_mutexattr_init(&mta);
	pthread_mutexattr_setprotocol(&mta, PTHREAD_PRIO_INHERIT);
	pthread_mutex_init(&mutex, &mta);
  
  	/* Create threads */
  	fprintf(stderr,"Creating %d threads\n", THREAD_NUM);
  	fprintf(stderr,"Executing %d loops\n", LOOPS);
  	for (i=0; i<THREAD_NUM; ++i)
    		rc = pthread_create(&threads[i], &pta, f1, NULL);

	/* Wait to join all threads */
  	for (i=0; i<THREAD_NUM; ++i)
    		pthread_join(threads[i], NULL);
  	pthread_attr_destroy(&pta);
  	pthread_mutex_destroy(&mutex);
  
  	/* Check if the final value is as expected */
  	if(value != (THREAD_NUM) * LOOPS) {
	  	fprintf(stderr,"Using %d threads and each loops %d times\n", THREAD_NUM, LOOPS);
    		fprintf(stderr,"Final value must be %d instead of %d\n", (THREAD_NUM)*LOOPS, value);
		printf("Test FAILED\n");
		return PTS_FAIL;
  	}
	
	printf("Test PASSED\n");
	return PTS_PASS;
}

void *f1(void *parm)
{
  	int   i, tmp;
  	int   rc = 0;

	static int next = 1;
	int thread_id = next++;
	int priority = thread_id + 4;
	setpri(priority);

	/* Loop M times: acquire the mutex, increment the value,
	   and then release the mutex. */
	   
  	for (i=0; i<LOOPS; ++i) {
      		rc = pthread_mutex_lock(&mutex);
      		if(rc!=0) {
        		fprintf(stderr,"Error on pthread_mutex_lock(), rc=%d\n", rc);
			return (void*)(PTS_FAIL);
      		}

    		tmp = value;
    		tmp = tmp+1;
    		// fprintf(stderr,"%2d: ", thread_id);
    		usleep((i%150)+9);	  /* delay the increment operation */
    		value = tmp;

      		rc = pthread_mutex_unlock(&mutex);
      		if(rc!=0) {
        		fprintf(stderr,"Error on pthread_mutex_unlock(), rc=%d\n", rc);
 			return (void*)(PTS_UNRESOLVED);
      		}
    		usleep(67);
  	}
  	pthread_exit(0);
  	return (void*)(0);
}

[-- Attachment #4: posixtest.h --]
[-- Type: text/plain, Size: 464 bytes --]

/*
 * Copyright (c) 2002, Intel Corporation. All rights reserved.
 * Created by:  julie.n.fleischer REMOVE-THIS AT intel DOT com
 * This file is licensed under the GPL license.  For the full content
 * of this license, see the COPYING file at the top level of this
 * source tree.
 */

/*
 * return codes
 */
#define PTS_PASS        0
#define PTS_FAIL        1
#define PTS_UNRESOLVED  2
#define PTS_UNSUPPORTED 0 /* was 4 */
#define PTS_UNTESTED    0 /* was 5 */



* Re: [BUG] 4.4.262: infinite loop in futex_unlock_pi (EAGAIN loop)
  2021-07-19 16:24 [BUG] 4.4.262: infinite loop in futex_unlock_pi (EAGAIN loop) Joe Korty
@ 2021-07-22 14:11 ` Greg Kroah-Hartman
  2021-07-27 22:19   ` Joe Korty
  0 siblings, 1 reply; 6+ messages in thread
From: Greg Kroah-Hartman @ 2021-07-22 14:11 UTC (permalink / raw)
  To: Joe Korty
  Cc: Peter Zijlstra, Lee Jones, Steven Rostedt, Thomas Gleixner,
	Sebastian Andrzej Siewior, LKML

On Mon, Jul 19, 2021 at 12:24:18PM -0400, Joe Korty wrote:
> [BUG] 4.4.262: infinite loop in futex_unlock_pi (EAGAIN loop)
> 
>    [ replicator, attached ]
>    [ workaround patch that crudely clears the loop, attached ]
>    [ 4.4.256 does _not_ have this problem, 4.4.262 is known to have it ]
> 
> When a certain, secure-site application is run on 4.4.262, it locks up and
> is unkillable.  Crash(8) and sysrq backtraces show that the application
> is looping in the kernel in futex_unlock_pi.
> 
> Between 4.4.256 and .257, 4.4 got this 4.12 patch backported into it:
> 
>    73d786b ("[PATCH] futex: Rework inconsistent rt_mutex/futex_q state")
> 
> This patch has the following comment:
> 
>    The only problem is that this breaks RT timeliness guarantees. That
>    is, consider the following scenario:
> 
>       T1 and T2 are both pinned to CPU0. prio(T2) > prio(T1)
> 
>         CPU0
> 
>         T1
>           lock_pi()
>           queue_me()  <- Waiter is visible
>    
>         preemption
> 
>         T2
>           unlock_pi()
>             loops with -EAGAIN forever
> 
>     Which is undesirable for PI primitives. Future patches will rectify
>     this.
> 
> This describes the situation exactly.  To prove, we developed a little
> kernel patch that, on loop detection, puts a message into the kernel log for
> just the first occurrence, keeps a count of the number of occurrences seen
> since boot, and tries to break out of the loop via usleep_range(1000,1000).
> Note that the patch is not really needed for replication.  It merely shows,
> by 'fixing' the problem, that it really is the EAGAIN loop that triggers
> the lockup.
> 
> Along with this patch, we submit a replicator.  Running this replicator
> with this patch, it can be seen that 4.4.256 does not have the problem.
> 4.4.267 and the latest 4.4, 4.4.275, do.  In addition, 4.9.274 (tested
> w/o the patch) does not have the problem.
> 
> From this pattern there may be some futex fixup patch that was ported
> back into 4.9 but failed to make it to 4.4.

Odd, I can't seem to find anything that we missed.  Can you dig to see
if there is something that we need to do here so we can resolve this?

thanks,

greg k-h


* Re: [BUG] 4.4.262: infinite loop in futex_unlock_pi (EAGAIN loop)
  2021-07-22 14:11 ` Greg Kroah-Hartman
@ 2021-07-27 22:19   ` Joe Korty
  2021-07-28  6:07     ` Greg Kroah-Hartman
  0 siblings, 1 reply; 6+ messages in thread
From: Joe Korty @ 2021-07-27 22:19 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Peter Zijlstra, Lee Jones, Steven Rostedt, Thomas Gleixner,
	Sebastian Andrzej Siewior, LKML, Ingo Molnar, Darren Hart,
	Davidlohr Bueso


 [ Added missing people to the cc: as listed in MAINTAINERS ]

On Thu, Jul 22, 2021 at 04:11:41PM +0200, Greg Kroah-Hartman wrote:
> On Mon, Jul 19, 2021 at 12:24:18PM -0400, Joe Korty wrote:
> > [BUG] 4.4.262: infinite loop in futex_unlock_pi (EAGAIN loop)
> > 
> >    [ replicator, attached ]
> >    [ workaround patch that crudely clears the loop, attached ]
> >    [ 4.4.256 does _not_ have this problem, 4.4.262 is known to have it ]
> > 
> > When a certain, secure-site application is run on 4.4.262, it locks up and
> > is unkillable.  Crash(8) and sysrq backtraces show that the application
> > is looping in the kernel in futex_unlock_pi.
> > 
> > Between 4.4.256 and .257, 4.4 got this 4.12 patch backported into it:
> > 
> >    73d786b ("[PATCH] futex: Rework inconsistent rt_mutex/futex_q state")
> > 
> > This patch has the following comment:
> > 
> >    The only problem is that this breaks RT timeliness guarantees. That
> >    is, consider the following scenario:
> > 
> >       T1 and T2 are both pinned to CPU0. prio(T2) > prio(T1)
> > 
> >         CPU0
> > 
> >         T1
> >           lock_pi()
> >           queue_me()  <- Waiter is visible
> >    
> >         preemption
> > 
> >         T2
> >           unlock_pi()
> >             loops with -EAGAIN forever
> > 
> >     Which is undesirable for PI primitives. Future patches will rectify
> >     this.
> > 
> > This describes the situation exactly.  To prove, we developed a little
> > kernel patch that, on loop detection, puts a message into the kernel log for
> > just the first occurrence, keeps a count of the number of occurrences seen
> > since boot, and tries to break out of the loop via usleep_range(1000,1000).
> > Note that the patch is not really needed for replication.  It merely shows,
> > by 'fixing' the problem, that it really is the EAGAIN loop that triggers
> > the lockup.
> > 
> > Along with this patch, we submit a replicator.  Running this replicator
> > with this patch, it can be seen that 4.4.256 does not have the problem.
> > 4.4.267 and the latest 4.4, 4.4.275, do.  In addition, 4.9.274 (tested
> > w/o the patch) does not have the problem.
> > 
> > From this pattern there may be some futex fixup patch that was ported
> > back into 4.9 but failed to make it to 4.4.
> 
> Odd, I can't seem to find anything that we missed.  Can you dig to see
> if there is something that we need to do here so we can resolve this?
> 
> thanks,
> greg k-h


Hi Greg,

4.12 has these apparently-original patches:

  73d786b  futex: Rework inconsistent rt_mutex/futex_q state
  cfafcd1  futex: Rework futex_lock_pi() to use rt_mutex_*_proxy_lock()

I have verified that the first commit, 73d786b, introduces
the futex_unlock_pi infinite loop bug into 4.12.  I have
also verified that the last commit, cfafcd1, fixes the bug.

4.9 has had both futex patches backported into it.
Verified that 4.9.276 does not suffer from the bug.

4.4 has had the first patch backported, as 394fc49, but
not the last.  I have verified that building a kernel at
394fc49 and at v4.4.276, the bug is seen, and at 394fc49^,
the bug is not present.

The missing commit, cfafcd1 in 4.12, is present in 4.9
as 13c98b0.  A visual spot-check of 13c98b0, as a patch,
with kernel/futex.c in 4.4.276 did not find any hunks of
13c98b0 present in 4.4.276's kernel/futex.c.
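
For reference, the checks above were of roughly this
shape (a sketch only; kernel build, install, and boot
steps are elided, and it assumes 4.9-stable has been
fetched into the same repository so that 13c98b0 is
available as an object):

    # per-revision test with the replicator (1-1.c from the first mail)
    git checkout 394fc49^       # likewise 394fc49, v4.4.276, v4.9.276
    # ... build and boot the resulting kernel, then:
    taskset 4 ./1-1             # hangs unkillably on an affected kernel
    dmesg | grep 'EAGAIN loop'  # fires only if the helper patch is in

    # mechanical variant of the visual spot-check: the 4.9 backport
    # reverse-applies only if its hunks are already present
    git checkout v4.4.276
    git show 13c98b0 -- kernel/futex.c | git apply --reverse --check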

Regards,
Joe



* Re: [BUG] 4.4.262: infinite loop in futex_unlock_pi (EAGAIN loop)
  2021-07-27 22:19   ` Joe Korty
@ 2021-07-28  6:07     ` Greg Kroah-Hartman
  2021-07-28 13:51       ` Joe Korty
  0 siblings, 1 reply; 6+ messages in thread
From: Greg Kroah-Hartman @ 2021-07-28  6:07 UTC (permalink / raw)
  To: Joe Korty
  Cc: Peter Zijlstra, Lee Jones, Steven Rostedt, Thomas Gleixner,
	Sebastian Andrzej Siewior, LKML, Ingo Molnar, Darren Hart,
	Davidlohr Bueso

On Tue, Jul 27, 2021 at 06:19:50PM -0400, Joe Korty wrote:
> 
>  [ Added missing people to the cc: as listed in MAINTAINERS ]
> 
> On Thu, Jul 22, 2021 at 04:11:41PM +0200, Greg Kroah-Hartman wrote:
> > On Mon, Jul 19, 2021 at 12:24:18PM -0400, Joe Korty wrote:
> > > [BUG] 4.4.262: infinite loop in futex_unlock_pi (EAGAIN loop)
> > > 
> > >    [ replicator, attached ]
> > >    [ workaround patch that crudely clears the loop, attached ]
> > >    [ 4.4.256 does _not_ have this problem, 4.4.262 is known to have it ]
> > > 
> > > When a certain, secure-site application is run on 4.4.262, it locks up and
> > > is unkillable.  Crash(8) and sysrq backtraces show that the application
> > > is looping in the kernel in futex_unlock_pi.
> > > 
> > > Between 4.4.256 and .257, 4.4 got this 4.12 patch backported into it:
> > > 
> > >    73d786b ("[PATCH] futex: Rework inconsistent rt_mutex/futex_q state")
> > > 
> > > This patch has the following comment:
> > > 
> > >    The only problem is that this breaks RT timeliness guarantees. That
> > >    is, consider the following scenario:
> > > 
> > >       T1 and T2 are both pinned to CPU0. prio(T2) > prio(T1)
> > > 
> > >         CPU0
> > > 
> > >         T1
> > >           lock_pi()
> > >           queue_me()  <- Waiter is visible
> > >    
> > >         preemption
> > > 
> > >         T2
> > >           unlock_pi()
> > >             loops with -EAGAIN forever
> > > 
> > >     Which is undesirable for PI primitives. Future patches will rectify
> > >     this.
> > > 
> > > This describes the situation exactly.  To prove, we developed a little
> > > kernel patch that, on loop detection, puts a message into the kernel log for
> > > just the first occurrence, keeps a count of the number of occurrences seen
> > > since boot, and tries to break out of the loop via usleep_range(1000,1000).
> > > Note that the patch is not really needed for replication.  It merely shows,
> > > by 'fixing' the problem, that it really is the EAGAIN loop that triggers
> > > the lockup.
> > > 
> > > Along with this patch, we submit a replicator.  Running this replicator
> > > with this patch, it can be seen that 4.4.256 does not have the problem.
> > > 4.4.267 and the latest 4.4, 4.4.275, do.  In addition, 4.9.274 (tested
> > > w/o the patch) does not have the problem.
> > > 
> > > From this pattern there may be some futex fixup patch that was ported
> > > back into 4.9 but failed to make it to 4.4.
> > 
> > Odd, I can't seem to find anything that we missed.  Can you dig to see
> > if there is something that we need to do here so we can resolve this?
> > 
> > thanks,
> > greg k-h
> 
> 
> Hi Greg,
> 
> 4.12 has these apparently-original patches:
> 
>   73d786b  futex: Rework inconsistent rt_mutex/futex_q state
>   cfafcd1  futex: Rework futex_lock_pi() to use rt_mutex_*_proxy_lock()
> 
> I have verified that the first commit, 73d786b, introduces
> the futex_unlock_pi infinite loop bug into 4.12.  I have
> also verified that the last commit, cfafcd1, fixes the bug.
> 
> 4.9 has had both futex patches backported into it.
> Verified that 4.9.276 does not suffer from the bug.
> 
> 4.4 has had the first patch backported, as 394fc49, but
> not the last.  I have verified that building a kernel at
> 394fc49 and at v4.4.276, the bug is seen, and at 394fc49^,
> the bug is not present.
> 
> The missing commit, cfafcd1 in 4.12, is present in 4.9
> as 13c98b0.  A visual spot-check of 13c98b0, as a patch,
> with kernel/futex.c in 4.4.276 did not find any hunks of
> 13c98b0 present in 4.4.276's kernel/futex.c.

Ok, so what do you recommend be done to resolve this?

thanks,

greg k-h


* Re: [BUG] 4.4.262: infinite loop in futex_unlock_pi (EAGAIN loop)
  2021-07-28  6:07     ` Greg Kroah-Hartman
@ 2021-07-28 13:51       ` Joe Korty
  2021-07-28 16:15         ` Greg Kroah-Hartman
  0 siblings, 1 reply; 6+ messages in thread
From: Joe Korty @ 2021-07-28 13:51 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Peter Zijlstra, Lee Jones, Steven Rostedt, Thomas Gleixner,
	Sebastian Andrzej Siewior, LKML, Ingo Molnar, Darren Hart,
	Davidlohr Bueso

On Wed, Jul 28, 2021 at 08:07:00AM +0200, Greg Kroah-Hartman wrote:
> On Tue, Jul 27, 2021 at 06:19:50PM -0400, Joe Korty wrote:
> > 
> >  [ Added missing people to the cc: as listed in MAINTAINERS ]
> > 
> > On Thu, Jul 22, 2021 at 04:11:41PM +0200, Greg Kroah-Hartman wrote:
> > > On Mon, Jul 19, 2021 at 12:24:18PM -0400, Joe Korty wrote:
> > > > [BUG] 4.4.262: infinite loop in futex_unlock_pi (EAGAIN loop)
> > > > 
> > > >    [ replicator, attached ]
> > > >    [ workaround patch that crudely clears the loop, attached ]
> > > >    [ 4.4.256 does _not_ have this problem, 4.4.262 is known to have it ]
> > > > 
> > > > When a certain, secure-site application is run on 4.4.262, it locks up and
> > > > is unkillable.  Crash(8) and sysrq backtraces show that the application
> > > > is looping in the kernel in futex_unlock_pi.
> > > > 
> > > > Between 4.4.256 and .257, 4.4 got this 4.12 patch backported into it:
> > > > 
> > > >    73d786b ("[PATCH] futex: Rework inconsistent rt_mutex/futex_q state")
> > > > 
> > > > This patch has the following comment:
> > > > 
> > > >    The only problem is that this breaks RT timeliness guarantees. That
> > > >    is, consider the following scenario:
> > > > 
> > > >       T1 and T2 are both pinned to CPU0. prio(T2) > prio(T1)
> > > > 
> > > >         CPU0
> > > > 
> > > >         T1
> > > >           lock_pi()
> > > >           queue_me()  <- Waiter is visible
> > > >    
> > > >         preemption
> > > > 
> > > >         T2
> > > >           unlock_pi()
> > > >             loops with -EAGAIN forever
> > > > 
> > > >     Which is undesirable for PI primitives. Future patches will rectify
> > > >     this.
> > > > 
> > > > This describes the situation exactly.  To prove, we developed a little
> > > > kernel patch that, on loop detection, puts a message into the kernel log for
> > > > just the first occurrence, keeps a count of the number of occurrences seen
> > > > since boot, and tries to break out of the loop via usleep_range(1000,1000).
> > > > Note that the patch is not really needed for replication.  It merely shows,
> > > > by 'fixing' the problem, that it really is the EAGAIN loop that triggers
> > > > the lockup.
> > > > 
> > > > Along with this patch, we submit a replicator.  Running this replicator
> > > > with this patch, it can be seen that 4.4.256 does not have the problem.
> > > > 4.4.267 and the latest 4.4, 4.4.275, do.  In addition, 4.9.274 (tested
> > > > w/o the patch) does not have the problem.
> > > > 
> > > > From this pattern there may be some futex fixup patch that was ported
> > > > back into 4.9 but failed to make it to 4.4.
> > > 
> > > Odd, I can't seem to find anything that we missed.  Can you dig to see
> > > if there is something that we need to do here so we can resolve this?
> > > 
> > > thanks,
> > > greg k-h
> > 
> > 
> > Hi Greg,
> > 
> > 4.12 has these apparently-original patches:
> > 
> >   73d786b  futex: Rework inconsistent rt_mutex/futex_q state
> >   cfafcd1  futex: Rework futex_lock_pi() to use rt_mutex_*_proxy_lock()
> > 
> > I have verified that the first commit, 73d786b, introduces
> > the futex_unlock_pi infinite loop bug into 4.12.  I have
> > also verified that the last commit, cfafcd1, fixes the bug.
> > 
> > 4.9 has had both futex patches backported into it.
> > Verified that 4.9.276 does not suffer from the bug.
> > 
> > 4.4 has had the first patch backported, as 394fc49, but
> > not the last.  I have verified that building a kernel at
> > 394fc49 and at v4.4.276, the bug is seen, and at 394fc49^,
> > the bug is not present.
> > 
> > The missing commit, cfafcd1 in 4.12, is present in 4.9
> > as 13c98b0.  A visual spot-check of 13c98b0, as a patch,
> > with kernel/futex.c in 4.4.276 did not find any hunks of
> > 13c98b0 present in 4.4.276's kernel/futex.c.
> 
> Ok, so what do you recommend be done to resolve this?
> 
> thanks,
> greg k-h

I suppose we could either back out 394fc49 from 4.4, or
backport 13c98b0 from 4.9 to 4.4.  At the time I wrote
the above, I hadn't tried either approach yet.

Since then, I did a trial backport of 13c98b0 into 4.4.
All the changes to kernel/futex.c applied, none of the
changes to kernel/locking/rtmutex.c applied.  That implies
to me that we have at least one other patch that needs
finding-n-backporting before we can proceed.
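
In outline, the trial looked something like this (a
sketch; it assumes 13c98b0 is available locally, e.g.
with the 4.9-stable tree fetched as a remote):

    git checkout -b futex-eagain-backport v4.4.276
    git cherry-pick -x 13c98b0
    # result: the kernel/futex.c hunks apply, but the
    # kernel/locking/rtmutex.c hunks conflict, so at least
    # one prerequisite rtmutex patch is also missing
    git cherry-pick --abort    # back out of the trial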

I hope this doesn't turn into a Whack-a-Mole operation...

Joe


* Re: [BUG] 4.4.262: infinite loop in futex_unlock_pi (EAGAIN loop)
  2021-07-28 13:51       ` Joe Korty
@ 2021-07-28 16:15         ` Greg Kroah-Hartman
  0 siblings, 0 replies; 6+ messages in thread
From: Greg Kroah-Hartman @ 2021-07-28 16:15 UTC (permalink / raw)
  To: Joe Korty
  Cc: Peter Zijlstra, Lee Jones, Steven Rostedt, Thomas Gleixner,
	Sebastian Andrzej Siewior, LKML, Ingo Molnar, Darren Hart,
	Davidlohr Bueso

On Wed, Jul 28, 2021 at 09:51:14AM -0400, Joe Korty wrote:
> On Wed, Jul 28, 2021 at 08:07:00AM +0200, Greg Kroah-Hartman wrote:
> > On Tue, Jul 27, 2021 at 06:19:50PM -0400, Joe Korty wrote:
> > > 
> > >  [ Added missing people to the cc: as listed in MAINTAINERS ]
> > > 
> > > On Thu, Jul 22, 2021 at 04:11:41PM +0200, Greg Kroah-Hartman wrote:
> > > > On Mon, Jul 19, 2021 at 12:24:18PM -0400, Joe Korty wrote:
> > > > > [BUG] 4.4.262: infinite loop in futex_unlock_pi (EAGAIN loop)
> > > > > 
> > > > >    [ replicator, attached ]
> > > > >    [ workaround patch that crudely clears the loop, attached ]
> > > > >    [ 4.4.256 does _not_ have this problem, 4.4.262 is known to have it ]
> > > > > 
> > > > > When a certain, secure-site application is run on 4.4.262, it locks up and
> > > > > is unkillable.  Crash(8) and sysrq backtraces show that the application
> > > > > is looping in the kernel in futex_unlock_pi.
> > > > > 
> > > > > Between 4.4.256 and .257, 4.4 got this 4.12 patch backported into it:
> > > > > 
> > > > >    73d786b ("[PATCH] futex: Rework inconsistent rt_mutex/futex_q state")
> > > > > 
> > > > > This patch has the following comment:
> > > > > 
> > > > >    The only problem is that this breaks RT timeliness guarantees. That
> > > > >    is, consider the following scenario:
> > > > > 
> > > > >       T1 and T2 are both pinned to CPU0. prio(T2) > prio(T1)
> > > > > 
> > > > >         CPU0
> > > > > 
> > > > >         T1
> > > > >           lock_pi()
> > > > >           queue_me()  <- Waiter is visible
> > > > >    
> > > > >         preemption
> > > > > 
> > > > >         T2
> > > > >           unlock_pi()
> > > > >             loops with -EAGAIN forever
> > > > > 
> > > > >     Which is undesirable for PI primitives. Future patches will rectify
> > > > >     this.
> > > > > 
> > > > > This describes the situation exactly.  To prove, we developed a little
> > > > > kernel patch that, on loop detection, puts a message into the kernel log for
> > > > > just the first occurrence, keeps a count of the number of occurrences seen
> > > > > since boot, and tries to break out of the loop via usleep_range(1000,1000).
> > > > > Note that the patch is not really needed for replication.  It merely shows,
> > > > > by 'fixing' the problem, that it really is the EAGAIN loop that triggers
> > > > > the lockup.
> > > > > 
> > > > > Along with this patch, we submit a replicator.  Running this replicator
> > > > > with this patch, it can be seen that 4.4.256 does not have the problem.
> > > > > 4.4.267 and the latest 4.4, 4.4.275, do.  In addition, 4.9.274 (tested
> > > > > w/o the patch) does not have the problem.
> > > > > 
> > > > > From this pattern there may be some futex fixup patch that was ported
> > > > > back into 4.9 but failed to make it to 4.4.
> > > > 
> > > > Odd, I can't seem to find anything that we missed.  Can you dig to see
> > > > if there is something that we need to do here so we can resolve this?
> > > > 
> > > > thanks,
> > > > greg k-h
> > > 
> > > 
> > > Hi Greg,
> > > 
> > > 4.12 has these apparently-original patches:
> > > 
> > >   73d786b  futex: Rework inconsistent rt_mutex/futex_q state
> > >   cfafcd1  futex: Rework futex_lock_pi() to use rt_mutex_*_proxy_lock()
> > > 
> > > I have verified that the first commit, 73d786b, introduces
> > > the futex_unlock_pi infinite loop bug into 4.12.  I have
> > > also verified that the last commit, cfafcd1, fixes the bug.
> > > 
> > > 4.9 has had both futex patches backported into it.
> > > Verified that 4.9.276 does not suffer from the bug.
> > > 
> > > 4.4 has had the first patch backported, as 394fc49, but
> > > not the last.  I have verified that building a kernel at
> > > 394fc49 and at v4.4.276, the bug is seen, and at 394fc49^,
> > > the bug is not present.
> > > 
> > > The missing commit, cfafcd1 in 4.12, is present in 4.9
> > > as 13c98b0.  A visual spot-check of 13c98b0, as a patch,
> > > with kernel/futex.c in 4.4.276 did not find any hunks of
> > > 13c98b0 present in 4.4.276's kernel/futex.c.
> > 
> > Ok, so what do you recommend be done to resolve this?
> > 
> > thanks,
> > greg k-h
> 
> I suppose we could either back out 394fc49 from 4.4, or
> backport 13c98b0 from 4.9 to 4.4.  At the time I wrote
> the above, I hadn't tried either approach yet.
> 
> Since then, I did a trial backport of 13c98b0 into 4.4.
> All the changes to kernel/futex.c applied, none of the
> changes to kernel/locking/rtmutex.c applied.  That implies
> to me that we have at least one other patch that needs
> finding-n-backporting before we can proceed.

Ok, let me know if there's anything I can apply here after you test
things.

greg k-h



Thread overview: 6+ messages
2021-07-19 16:24 [BUG] 4.4.262: infinite loop in futex_unlock_pi (EAGAIN loop) Joe Korty
2021-07-22 14:11 ` Greg Kroah-Hartman
2021-07-27 22:19   ` Joe Korty
2021-07-28  6:07     ` Greg Kroah-Hartman
2021-07-28 13:51       ` Joe Korty
2021-07-28 16:15         ` Greg Kroah-Hartman
