linux-kernel.vger.kernel.org archive mirror
From: Sebastian Sewior <bigeasy@linutronix.de>
To: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@kernel.org>,
	Martin Schwidefsky <schwidefsky@de.ibm.com>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-s390@vger.kernel.org, Stefan Liebler <stli@linux.ibm.com>
Subject: Re: WARN_ON_ONCE(!new_owner) within wake_futex_pi() triggered
Date: Tue, 29 Jan 2019 18:16:53 +0100
Message-ID: <20190129171653.ycl64psq2liy5o5c@linutronix.de>
In-Reply-To: <20190129151058.GG26906@osiris>

On 2019-01-29 16:10:58 [+0100], Heiko Carstens wrote:
> Finally... the trace output is quite large with 26 MB... Therefore an
> xz compressed attachment. Hope that's ok.
> 
> The kernel used was linux-next 20190129 + your patch.
|        ld64.so.1-10237 [006] .... 14232.031726: sys_futex(uaddr: 3ff88e80618, op: 7, val: 3ff00000007, utime: 3ff88e7f910, uaddr2: 3ff88e7f910, val3: 3ffc167e8d7)
FUTEX_UNLOCK_PI | SHARED

|        ld64.so.1-10237 [006] .... 14232.031726: sys_futex -> 0x0
…
|        ld64.so.1-10237 [006] .... 14232.051751: sched_process_exit: comm=ld64.so.1 pid=10237 prio=120
…
|        ld64.so.1-10148 [006] .... 14232.061826: sys_futex(uaddr: 3ff88e80618, op: 6, val: 1, utime: 0, uaddr2: 2, val3: 0)
FUTEX_LOCK_PI | SHARED

|        ld64.so.1-10148 [006] .... 14232.061826: sys_futex -> 0xfffffffffffffffd

So there has to be another task that acquired the lock in userland and
has exited since the last in-kernel user unlocked it. This might shed
more light on it:

diff --git a/kernel/futex.c b/kernel/futex.c
index 599da35c2768..aaa782a8a115 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -1209,6 +1209,9 @@ static int handle_exit_race(u32 __user *uaddr, u32 uval,
 	 * corrupted or the user space value in *uaddr is simply bogus.
 	 * Give up and tell user space.
 	 */
+	trace_printk("uval2 vs uval %08x vs %08x (%d)\n", uval2, uval,
+		     tsk ? tsk->pid : -1);
+	__WARN();
 	return -ESRCH;
 }
 
@@ -1233,8 +1236,10 @@ static int attach_to_pi_owner(u32 __user *uaddr, u32 uval, union futex_key *key,
 	if (!pid)
 		return -EAGAIN;
 	p = find_get_task_by_vpid(pid);
-	if (!p)
+	if (!p) {
+		trace_printk("Missing pid %d\n", pid);
 		return handle_exit_race(uaddr, uval, NULL);
+	}
 
 	if (unlikely(p->flags & PF_KTHREAD)) {
 		put_task_struct(p);

---

I am not sure, but isn't this the "known" issue where the kernel returns
ESRCH in a valid case and upstream glibc does not handle it because
it is not a valid POSIX-defined error code? (I *think* the same is true
for -ENOMEM.) If it is, the following C snippet is a small test case:

---->8------
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>
#include <stdlib.h>
#include <pthread.h>
#include <stdio.h>

static char nothing[4096];

int main(void)
{
	int fd;
	ssize_t wn;
	void *lockm;
	pid_t child;
	pthread_mutex_t *the_lock;
	pthread_mutexattr_t mutexattr;
	int ret;

	fd = open("/dev/shm/futex-test-lock", O_RDWR | O_CREAT | O_TRUNC, 0644);
	if (fd < 0) {
		printf("Failed to create lock file: %m\n");
		return 1;
	}
	wn = write(fd, nothing, sizeof(nothing));
	if (wn != sizeof(nothing)) {
		printf("Failed to write to file: %m\n");
		goto out_unlink;
	}

	lockm = mmap(NULL, sizeof(nothing), PROT_READ | PROT_WRITE, MAP_SHARED,
		     fd, 0);
	if (lockm == MAP_FAILED) {
		printf("mmap() failed: %m\n");
		goto out_unlink;
	}
	close(fd);
	unlink("/dev/shm/futex-test-lock");
	the_lock = lockm;

	ret = pthread_mutexattr_init(&mutexattr);
	ret |= pthread_mutexattr_setpshared(&mutexattr, PTHREAD_PROCESS_SHARED);
	ret |= pthread_mutexattr_setprotocol(&mutexattr, PTHREAD_PRIO_INHERIT);

	if (ret) {
		printf("Something went wrong during init\n");
		return 1;
	}

	ret = pthread_mutex_init(the_lock, &mutexattr);
	if (ret) {
		printf("Failed to init the lock\n");
		return 1;
	}
	child = fork();
	if (child < 0) {
		printf("fork(): %m\n");
		return 1;
	}

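	if (!child) {
		/*
		 * Child: take the uncontended lock (no futex syscall needed)
		 * and exit without unlocking. The futex word keeps its TID.
		 */
		pthread_mutex_lock(the_lock);
		exit(2);
	}

	/* Parent: give the child time to exit, then try to take the lock. */
	sleep(2);
	ret = pthread_mutex_lock(the_lock);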

	printf("-> %x\n", ret);
	return 0;

out_unlink:
	unlink("/dev/shm/futex-test-lock");
	return 1;
}

---------------8<-----------------------

strace gives this:
|openat(AT_FDCWD, "/dev/shm/futex-test-lock", O_RDWR|O_CREAT|O_TRUNC, 0644) = 3
|write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
|mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_SHARED, 3, 0) = 0x7f5e23e37000
|close(3)                                = 0
|unlink("/dev/shm/futex-test-lock")      = 0
…
|clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f5e23c1da10) = 25777
|strace: Process 25777 attached
|[pid 25776] nanosleep({tv_sec=2, tv_nsec=0},  <unfinished ...>
|[pid 25777] set_robust_list(0x7f5e23c1da20, 24) = 0
|[pid 25777] exit_group(2)               = ?
|[pid 25777] +++ exited with 2 +++
|<... nanosleep resumed> {tv_sec=1, tv_nsec=999821679}) = ? ERESTART_RESTARTBLOCK (Interrupted by signal)
|--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=25777, si_uid=1000, si_status=2, si_utime=0, si_stime=0} ---
|restart_syscall(<... resuming interrupted nanosleep ...>) = 0
|futex(0x7f5e23e37000, FUTEX_LOCK_PI, NULL) = -1 ESRCH (No such process)
|pause(^Cstrace: Process 25776 detached

and if I remember correctly, if asserts are not enabled we end up with a
pause loop instead.
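
The ESRCH in the strace output can also be provoked without glibc by issuing
FUTEX_LOCK_PI directly on a futex word that names an owner which no longer
exists. What follows is a minimal sketch under stated assumptions, not the
test case above: the helper name is hypothetical, it assumes a 64-bit Linux
system with PI futex support, and it fakes the stale owner by writing the
TID of an already reaped child into the futex word. The one-second absolute
timeout is only a safety net in case the call blocks instead of failing.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns the errno reported by FUTEX_LOCK_PI when the
 * futex word holds the TID of a task that has exited, 0 if the lock was
 * unexpectedly acquired, or -1 if the setup itself failed.
 */
static int lock_pi_with_dead_owner(void)
{
	static uint32_t futex_word;
	struct timespec timeout;
	pid_t child;

	child = fork();
	if (child < 0)
		return -1;
	if (child == 0)
		_exit(0);
	/* Reap the child so its PID/TID no longer refers to a live task. */
	if (waitpid(child, NULL, 0) != child)
		return -1;

	/* A PI futex word stores the owner's TID; pretend the child owned it. */
	futex_word = (uint32_t)child;

	/* FUTEX_LOCK_PI takes an absolute CLOCK_REALTIME timeout. */
	if (clock_gettime(CLOCK_REALTIME, &timeout))
		return -1;
	timeout.tv_sec += 1;

	if (syscall(SYS_futex, &futex_word, FUTEX_LOCK_PI, 0, &timeout,
		    NULL, 0) == 0)
		return 0;
	return errno;
}
```

On a kernel that behaves like the one in the strace output, the helper
would be expected to return ESRCH, matching the futex() line above.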

Sebastian
