linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Thomas Gleixner <tglx@linutronix.de>
To: Torvald Riegel <triegel@redhat.com>
Cc: "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com>,
	"Carlos O'Donell" <carlos@redhat.com>,
	Darren Hart <dvhart@linux.intel.com>, Ingo Molnar <mingo@elte.hu>,
	Jakub Jelinek <jakub@redhat.com>,
	"linux-man@vger.kernel.org" <linux-man@vger.kernel.org>,
	lkml <linux-kernel@vger.kernel.org>,
	Davidlohr Bueso <davidlohr.bueso@hp.com>,
	Arnd Bergmann <arnd@arndb.de>,
	Steven Rostedt <rostedt@goodmis.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Linux API <linux-api@vger.kernel.org>,
	Darren Hart <dvhart@infradead.org>,
	Anton Blanchard <anton@samba.org>, Petr Baudis <pasky@suse.cz>,
	Eric Dumazet <edumazet@google.com>,
	bill o gallmeister <bgallmeister@gmail.com>,
	Jan Kiszka <jan.kiszka@siemens.com>,
	Daniel Wagner <wagi@monom.org>, Rich Felker <dalias@libc.org>
Subject: Re: futex(2) man page update help request
Date: Sat, 24 Jan 2015 12:35:01 +0100 (CET)	[thread overview]
Message-ID: <alpine.DEB.2.11.1501241116160.5526@nanos> (raw)
In-Reply-To: <1422037788.29655.0.camel@triegel.csb>

On Fri, 23 Jan 2015, Torvald Riegel wrote:
> Second, the current documentation for EINTR is that it can happen due to
> receiving a signal *or* due to a spurious wake-up.  This is difficult to

I don't think so. I went through all callchains again with a fine comb.

futex_wait()
retry:
	ret = futex_wait_setup();
	if (ret) {
		 /*
		  * Possible return codes related to uaddr:
		  * -EINVAL:    Not u32 aligned uaddr
		  * -EFAULT:    No mapping, no RW
		  * -ENOMEM:    Paging ran out of memory
		  * -EHWPOISON: Memory hardware error
		  *
		  * Others:
		  * -EWOULDBLOCK: value at uaddr has changed
		  */
		return ret;
	}

	futex_wait_queue_me();

	if (woken by futex_wake/requeue)
	   	return 0;

	if (timeout)
		return -ETIMEOUT;

	/*
	 * Spurious wakeup, i.e. no signal pending
	 */
	if (!signal_pending())
		goto retry;

	/* Handled in the low level syscall exit code */
	if (!timed_wait)
		return -ERESTARTSYS;
	else
		return -ERESTARTBLOCK;

Now in the low level syscall exit we try to deliver the signal

	if (!signal_delivered())
	      restart_syscall();

	if (sigaction->flags & SA_RESTART)
	      restart_syscall();

	ret_to_userspace -EINTR;

So we should never see -EINTR in the case of a spurious wakeup here.

But, here is the not so good news:

 I did some archaeology. The restart handling of futex_wait() got
 introduced in kernel 2.6.22, so anything older than that will have
 the spurious -EINTR issues.

futex_wait_pi() always had the restart handling and glibc folks back
then (2006) requested that it should never return -EINTR, so it
unconditionally restarts the syscall whether a signal had been
delivered or not.

So kernels >= 2.6.22 should never return -EINTR spuriously. If that
happens it's a bug and needs to be fixed.

> Third, I think it would be useful to -- somewhere -- explain which
> behavior the futex operations would have conceptually when expressed by
> C11 code.  We currently say that they wake up, sleep, etc, and which
> values they return.  But we never say how to properly synchronize with
> them on the userspace side.  The C11 memory model is probably the best
> model to use on the userspace side, so that's why I'm arguing for this.
> Basically, I think we need to (1) tell people that they should use
> memory_order_relaxed accesses to the futex variable (ie, the memory
> location associated with the whole futex construct on the kernel side --
> or do we have another name for this?), and (2) give some conceptual
> guarantees for the kernel-side synchronization so that one use this to
> derive how to use them correctly in userspace.
> 
> The man pages might not be the right place for this, and maybe we just
> need a revision of "Futexes are tricky".  If you have other suggestions
> for where to document this, or on the content, let me know.  (I'm also
> willing to spend time on this :) ).

The current futex code in the kernel has gained documentation about
the required memory ordering recently. That should be a good starting
point.

Thanks,

	tglx

  reply	other threads:[~2015-01-24 11:35 UTC|newest]

Thread overview: 80+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-05-14 10:35 futex(2) man page update help request Michael Kerrisk (man-pages)
2014-05-14 16:18 ` Darren Hart
2014-05-14 19:03   ` Michael Kerrisk (man-pages)
2014-05-14 19:59     ` Darren Hart
2014-05-14 20:23     ` Carlos O'Donell
2014-05-14 20:44       ` Andy Lutomirski
2014-05-14 23:34       ` Thomas Gleixner
2014-05-15  3:12         ` Carlos O'Donell
2014-05-15  4:49           ` Michael Kerrisk (man-pages)
2014-05-15  4:53         ` Michael Kerrisk (man-pages)
2014-05-15 14:14           ` Thomas Gleixner
2014-05-15 20:19             ` Michael Kerrisk (man-pages)
2014-08-04 14:46               ` Carlos O'Donell
2014-05-15 20:35             ` Darren Hart
2015-01-15 15:12               ` Michael Kerrisk (man-pages)
2015-01-17  1:33                 ` Darren Hart
2015-01-17  9:16                   ` Michael Kerrisk (man-pages)
2015-01-17 19:26                     ` Darren Hart
2015-01-18 10:18                       ` Michael Kerrisk (man-pages)
2015-01-15 15:10             ` Michael Kerrisk (man-pages)
2015-01-15 22:23               ` Thomas Gleixner
2015-01-16 15:17                 ` Michael Kerrisk (man-pages)
2015-01-16 15:20                   ` Thomas Gleixner
2015-01-16 20:54                     ` Michael Kerrisk (man-pages)
2015-01-17  0:46                       ` Darren Hart
2015-01-19 10:45                         ` Thomas Gleixner
2015-01-19 14:07                           ` Michael Kerrisk (man-pages)
2015-01-23 18:19                         ` Torvald Riegel
2015-01-24 10:05                           ` Thomas Gleixner
2015-01-24 12:58                             ` Torvald Riegel
2015-01-24 16:25                               ` Thomas Gleixner
2015-01-17  0:56                       ` Davidlohr Bueso
2015-01-17  1:11                         ` Darren Hart
2015-01-23 18:29               ` Torvald Riegel
2015-01-24 11:35                 ` Thomas Gleixner [this message]
2015-01-24 13:12                   ` Torvald Riegel
2015-01-27  7:48                     ` Michael Kerrisk (man-pages)
2015-02-05 19:57                   ` Darren Hart
2014-05-15  8:13       ` Peter Zijlstra
2014-05-15 15:43         ` Darren Hart
2014-05-15  8:14       ` Peter Zijlstra
2014-05-15 13:18         ` Carlos O'Donell
2014-05-15 13:22           ` Peter Zijlstra
2014-05-15 13:49             ` Michael Kerrisk (man-pages)
2014-05-15 13:55               ` Peter Zijlstra
2014-05-15 14:39               ` Carlos O'Donell
2014-05-15 15:11                 ` Peter Zijlstra
2014-05-14 20:56     ` Davidlohr Bueso
2014-05-14 21:03       ` Darren Hart
2014-05-14 22:21         ` Paul E. McKenney
2014-05-15  0:28       ` H. Peter Anvin
2014-05-15  0:35         ` Andy Lutomirski
2014-05-15  0:41           ` H. Peter Anvin
2014-05-15 19:10         ` Carlos O'Donell
2014-05-14 21:05   ` Davidlohr Bueso
2014-05-15 15:15     ` Joseph S. Myers
2014-05-15  0:18   ` H. Peter Anvin
2014-05-15  5:21     ` Darren Hart
2014-05-15  8:23       ` Peter Zijlstra
2014-05-15 13:46       ` Michael Kerrisk (man-pages)
2014-05-15 14:59         ` H. Peter Anvin
2014-05-15 15:42         ` chrubis
2014-05-15 15:52           ` H. Peter Anvin
2014-05-15 16:01             ` chrubis
2014-05-15 16:07               ` H. Peter Anvin
2014-05-15 16:17                 ` chrubis
2014-05-15 16:56                   ` H. Peter Anvin
2014-05-15 17:06                     ` chrubis
2014-05-15 15:47         ` Darren Hart
2014-05-15 15:35     ` chrubis
2014-05-15 15:28   ` chrubis
2014-05-15 15:40     ` Steven Rostedt
2014-05-15 16:14     ` Darren Hart
2014-05-15 16:30       ` chrubis
2014-05-15 18:17         ` Darren Hart
2014-05-15 19:05           ` chrubis
2014-05-15 19:38             ` Darren Hart
2014-08-11 10:19               ` chrubis
2014-11-26 13:41               ` Cyril Hrubis
2015-02-16 13:14               ` Cyril Hrubis

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.DEB.2.11.1501241116160.5526@nanos \
    --to=tglx@linutronix.de \
    --cc=anton@samba.org \
    --cc=arnd@arndb.de \
    --cc=bgallmeister@gmail.com \
    --cc=carlos@redhat.com \
    --cc=dalias@libc.org \
    --cc=davidlohr.bueso@hp.com \
    --cc=dvhart@infradead.org \
    --cc=dvhart@linux.intel.com \
    --cc=edumazet@google.com \
    --cc=jakub@redhat.com \
    --cc=jan.kiszka@siemens.com \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-man@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=mtk.manpages@gmail.com \
    --cc=pasky@suse.cz \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=triegel@redhat.com \
    --cc=wagi@monom.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).