linux-kernel.vger.kernel.org archive mirror
* [RFC] pipe: prevent compiler reordering in pipe_poll
@ 2018-08-24 22:54 Eric Wong
  2018-08-24 23:05 ` Al Viro
  2018-09-10  8:55 ` Paolo Bonzini
  0 siblings, 2 replies; 3+ messages in thread
From: Eric Wong @ 2018-08-24 22:54 UTC
  To: Al Viro; +Cc: linux-fsdevel, linux-kernel, Paolo Bonzini

The pipe_poll function does not use locks, and adding an entry
to the waitqueue is not guaranteed to happen before pipe->nrbufs
(or other fields) are read, leading to missed wakeups.

Looking at Ruby CI build logs and backtraces, I've noticed
occasional instances where processes are stuck in select(2) or
ppoll(2) with a pipe.

I don't have access to the systems where this is happening to
test/reproduce the problem, and haven't been able to reproduce
it locally on less-powerful hardware, either.  However, it seems
like a problem based on similar comments in
fs/eventfd.c::eventfd_poll made by Paolo.

Signed-off-by: Eric Wong <normalperson@yhbt.net>
Cc: Paolo Bonzini <pbonzini@redhat.com>
---
 fs/pipe.c | 32 ++++++++++++++++++++++++++++++--
 1 file changed, 30 insertions(+), 2 deletions(-)

diff --git a/fs/pipe.c b/fs/pipe.c
index 39d6f431da83..1a904d941cf1 100644
--- a/fs/pipe.c
+++ b/fs/pipe.c
@@ -509,7 +509,7 @@ static long pipe_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
 	}
 }
 
-/* No kernel lock held - fine */
+/* No kernel lock held - fine, but a compiler barrier is required */
 static __poll_t
 pipe_poll(struct file *filp, poll_table *wait)
 {
@@ -519,7 +519,35 @@ pipe_poll(struct file *filp, poll_table *wait)
 
 	poll_wait(filp, &pipe->wait, wait);
 
-	/* Reading only -- no need for acquiring the semaphore.  */
+	/*
+	 * Reading only -- no need for acquiring the semaphore, but
+	 * we need a compiler barrier to ensure the compiler does
+	 * not reorder reads to pipe->nrbufs, pipe->writers,
+	 * pipe->readers, filp->f_version, pipe->w_counter, and
+	 * pipe->buffers before poll_wait to avoid missing wakeups
+	 * from compiler reordering.  In other words, we need to
+	 * prevent the following situation:
+	 *
+	 * pipe_poll                          pipe_write
+	 * -----------------                  ------------
+	 * nrbufs = pipe->nrbufs (INVALID!)
+	 *
+	 *                                    __pipe_lock
+	 *                                    pipe->nrbufs = ++bufs;
+	 *                                    __pipe_unlock
+	 *                                    wake_up_interruptible_sync_poll
+	 *                                      pipe->wait is empty, no wakeup
+	 *
+	 * lock pipe->wait.lock (in poll_wait)
+	 * __add_wait_queue
+	 * unlock pipe->wait.lock
+	 *
+	 *  // pipe->nrbufs should be read here, NOT above
+	 *
+	 * pipe_poll returns 0 (WRONG)
+	 */
+	barrier();
+
 	nrbufs = pipe->nrbufs;
 	mask = 0;
 	if (filp->f_mode & FMODE_READ) {
-- 
EW


* Re: [RFC] pipe: prevent compiler reordering in pipe_poll
  2018-08-24 22:54 [RFC] pipe: prevent compiler reordering in pipe_poll Eric Wong
@ 2018-08-24 23:05 ` Al Viro
  2018-09-10  8:55 ` Paolo Bonzini
  1 sibling, 0 replies; 3+ messages in thread
From: Al Viro @ 2018-08-24 23:05 UTC
  To: Eric Wong; +Cc: linux-fsdevel, linux-kernel, Paolo Bonzini

On Fri, Aug 24, 2018 at 10:54:31PM +0000, Eric Wong wrote:
> The pipe_poll function does not use locks, and adding an entry
> to the waitqueue is not guaranteed to happen before pipe->nrbufs
> (or other fields) are read, leading to missed wakeups.
> 
> Looking at Ruby CI build logs and backtraces, I've noticed
> occasional instances where processes are stuck in select(2) or
> ppoll(2) with a pipe.
> 
> I don't have access to the systems where this is happening to
> test/reproduce the problem, and haven't been able to reproduce
> it locally on less-powerful hardware, either.  However, it seems
> like a problem based on similar comments in
> fs/eventfd.c::eventfd_poll made by Paolo.

You are misinterpreting those comments.  That load *can't* migrate
to anything earlier than taking the queue lock, since that acts as
an acquire barrier.  READ_ONCE in eventfd_poll() is not there to prevent
compiler reordering - it's there to prevent an insane compiler from doing
multiple loads of a possibly changing variable (ctx->count there).
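
To make the multiple-loads point concrete, here is a minimal userspace
sketch.  It is not kernel code: READ_ONCE_SKETCH, poll_flags_sketch and the
flag values are hypothetical names made up for illustration, and the macro
assumes the GCC/Clang __typeof__ extension.  The volatile access yields
exactly one load, so both tests below see the same snapshot even if another
thread changes the counter in between.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical userspace stand-in for the kernel's READ_ONCE().
 * Assumes the GCC/Clang __typeof__ extension.  The volatile access
 * makes the compiler emit exactly one load per evaluation.
 */
#define READ_ONCE_SKETCH(x) (*(const volatile __typeof__(x) *)&(x))

/*
 * Derive two flags from a counter that another thread may change at
 * any time.  Both tests use the same snapshot, so the flags cannot
 * contradict each other.  The flag values and the limit of 10 are
 * made up for this sketch.
 */
static unsigned int poll_flags_sketch(unsigned long *counter)
{
	unsigned long count;
	unsigned int flags = 0;

	assert(counter != NULL);
	count = READ_ONCE_SKETCH(*counter);	/* single load */

	if (count > 0)
		flags |= 0x1u;	/* "readable" */
	if (count < 10)
		flags |= 0x2u;	/* "writable" */
	return flags;
}
```

Without the marked access, a compiler would be allowed to load the counter
once per test, and the two flags could then describe two different values.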


* Re: [RFC] pipe: prevent compiler reordering in pipe_poll
  2018-08-24 22:54 [RFC] pipe: prevent compiler reordering in pipe_poll Eric Wong
  2018-08-24 23:05 ` Al Viro
@ 2018-09-10  8:55 ` Paolo Bonzini
  1 sibling, 0 replies; 3+ messages in thread
From: Paolo Bonzini @ 2018-09-10  8:55 UTC
  To: Eric Wong, Al Viro; +Cc: linux-fsdevel, linux-kernel

On 25/08/2018 00:54, Eric Wong wrote:
> The pipe_poll function does not use locks, and adding an entry
> to the waitqueue is not guaranteed to happen before pipe->nrbufs
> (or other fields) are read, leading to missed wakeups.
> 
> Looking at Ruby CI build logs and backtraces, I've noticed
> occasional instances where processes are stuck in select(2) or
> ppoll(2) with a pipe.
> 
> I don't have access to the systems where this is happening to
> test/reproduce the problem, and haven't been able to reproduce
> it locally on less-powerful hardware, either.  However, it seems
> like a problem based on similar comments in
> fs/eventfd.c::eventfd_poll made by Paolo.

The documentation change can be useful, but if you add a compiler
barrier you should also mention why reordering at the processor level is
okay.  In this case, processor-level reordering is okay because (just
like in fs/eventfd.c) poll_wait acts as an acquire barrier.

*However* I would be surprised if the scenario (even the one in
fs/eventfd.c) can actually happen, and I don't think the compiler
barrier is useful; there's no reason why the compiler should think that
it can hoist the reads above poll_wait.
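
The ordering argument can be sketched in userspace with C11 atomics.  This
is a toy model under stated assumptions, not the kernel implementation:
struct toy_pipe and all toy_* functions are hypothetical, and a spinlock
built from atomic_flag stands in for pipe->wait.lock.  The poller registers
under the lock and only then samples nrbufs; the writer publishes nrbufs and
only then takes the same lock to look for waiters.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model; none of these names exist in the kernel. */
struct toy_pipe {
	atomic_flag wait_lock;	/* stands in for pipe->wait.lock */
	int waiters;		/* protected by wait_lock */
	atomic_int nrbufs;	/* sampled by the poller outside the lock */
};

static void toy_lock(struct toy_pipe *p)
{
	/* Acquire: later loads cannot be moved before this point. */
	while (atomic_flag_test_and_set_explicit(&p->wait_lock,
						 memory_order_acquire))
		;
}

static void toy_unlock(struct toy_pipe *p)
{
	/* Release: earlier stores are visible to the next lock holder. */
	atomic_flag_clear_explicit(&p->wait_lock, memory_order_release);
}

/* Poller: register on the wait queue first, then sample the state. */
static bool toy_poll(struct toy_pipe *p)
{
	assert(p != NULL);
	toy_lock(p);
	p->waiters++;
	toy_unlock(p);
	return atomic_load_explicit(&p->nrbufs, memory_order_relaxed) > 0;
}

/* Writer: publish data first, then check for waiters under the lock. */
static bool toy_write(struct toy_pipe *p)
{
	bool woke;

	assert(p != NULL);
	atomic_fetch_add_explicit(&p->nrbufs, 1, memory_order_relaxed);
	toy_lock(p);
	woke = p->waiters > 0;	/* true means a wakeup would be delivered */
	p->waiters = 0;
	toy_unlock(p);
	return woke;
}
```

Whichever side takes the lock first, nothing is lost in this model: if the
poller registers first, the writer finds it on the queue; if the writer
goes first, its update to nrbufs precedes its unlock, so the poller's read
after its own lock acquisition observes the new value.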

In fact, there is a big difference between READ_ONCE() and barrier() for
whoever reads the code, which makes the code after your patch worse than
before.  READ_ONCE() means "I know I am accessing this variable outside
a lock".  barrier() means one of two things: 1) "I know what I am doing
can trick the compiler, and I don't want that to happen"; 2) "I am
synchronizing against other things happening on this CPU" such as
interrupts.  In this case you are doing neither.
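
For readers unfamiliar with barrier(), a rough userspace sketch of what it
does follows.  The macro name barrier_sketch and the function
sample_twice_sketch are hypothetical, and the empty asm statement with a
"memory" clobber assumes GCC or Clang.  It constrains only the compiler:
no CPU instruction is emitted, so it gives no ordering against other
processors.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the kernel's barrier(); assumes GCC/Clang
 * inline asm.  The "memory" clobber tells the compiler that memory may
 * have changed, so it may not move memory accesses across this point
 * or reuse values it loaded earlier.  No instruction is generated.
 */
#define barrier_sketch() __asm__ __volatile__("" ::: "memory")

/*
 * Load *p, then load it again after the barrier.  Because of the
 * barrier, the compiler may not merge the two loads into one.
 */
static int sample_twice_sketch(const int *p)
{
	int first, second;

	assert(p != NULL);
	first = *p;
	barrier_sketch();
	second = *p;	/* loaded again, not reused from 'first' */
	return first + second;
}
```

In a single-threaded run the two loads return the same value, so the result
is simply twice the input; the barrier only matters when something else
(an interrupt handler, for instance) can change the variable in between.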

Paolo



