From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753580AbbK2WEy (ORCPT ); Sun, 29 Nov 2015 17:04:54 -0500 Received: from wtarreau.pck.nerim.net ([62.212.114.60]:56526 "EHLO 1wt.eu" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753358AbbK2WBd (ORCPT ); Sun, 29 Nov 2015 17:01:33 -0500 Message-Id: <20151129214704.108386928@1wt.eu> User-Agent: quilt/0.63-1 Date: Sun, 29 Nov 2015 22:47:31 +0100 From: Willy Tarreau To: linux-kernel@vger.kernel.org, stable@vger.kernel.org Cc: Kosuke Tatsukawa , Greg Kroah-Hartman , Ben Hutchings , Willy Tarreau Subject: [PATCH 2.6.32 29/38] [PATCH 29/38] tty: fix stall caused by missing memory barrier in drivers/tty/n_tty.c MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-15 In-Reply-To: <8acf8256ccc72771a80b7851061027bc@local> Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org 2.6.32-longterm review patch. If anyone has any objections, please let me know. ------------------ commit e81107d4c6bd098878af9796b24edc8d4a9524fd upstream. My colleague ran into a program stall on a x86_64 server, where n_tty_read() was waiting for data even if there was data in the buffer in the pty. kernel stack for the stuck process looks like below. #0 [ffff88303d107b58] __schedule at ffffffff815c4b20 #1 [ffff88303d107bd0] schedule at ffffffff815c513e #2 [ffff88303d107bf0] schedule_timeout at ffffffff815c7818 #3 [ffff88303d107ca0] wait_woken at ffffffff81096bd2 #4 [ffff88303d107ce0] n_tty_read at ffffffff8136fa23 #5 [ffff88303d107dd0] tty_read at ffffffff81368013 #6 [ffff88303d107e20] __vfs_read at ffffffff811a3704 #7 [ffff88303d107ec0] vfs_read at ffffffff811a3a57 #8 [ffff88303d107f00] sys_read at ffffffff811a4306 #9 [ffff88303d107f50] entry_SYSCALL_64_fastpath at ffffffff815c86d7 There seems to be two problems causing this issue. First, in drivers/tty/n_tty.c, __receive_buf() stores the data and updates ldata->commit_head using smp_store_release() and then checks the wait queue using waitqueue_active(). However, since there is no memory barrier, __receive_buf() could return without calling wake_up_interactive_poll(), and at the same time, n_tty_read() could start to wait in wait_woken() as in the following chart. __receive_buf() n_tty_read() ------------------------------------------------------------------------ if (waitqueue_active(&tty->read_wait)) /* Memory operations issued after the RELEASE may be completed before the RELEASE operation has completed */ add_wait_queue(&tty->read_wait, &wait); ... if (!input_available_p(tty, 0)) { smp_store_release(&ldata->commit_head, ldata->read_head); ... timeout = wait_woken(&wait, TASK_INTERRUPTIBLE, timeout); ------------------------------------------------------------------------ The second problem is that n_tty_read() also lacks a memory barrier call and could also cause __receive_buf() to return without calling wake_up_interactive_poll(), and n_tty_read() to wait in wait_woken() as in the chart below. __receive_buf() n_tty_read() ------------------------------------------------------------------------ spin_lock_irqsave(&q->lock, flags); /* from add_wait_queue() */ ... if (!input_available_p(tty, 0)) { /* Memory operations issued after the RELEASE may be completed before the RELEASE operation has completed */ smp_store_release(&ldata->commit_head, ldata->read_head); if (waitqueue_active(&tty->read_wait)) __add_wait_queue(q, wait); spin_unlock_irqrestore(&q->lock,flags); /* from add_wait_queue() */ ... timeout = wait_woken(&wait, TASK_INTERRUPTIBLE, timeout); ------------------------------------------------------------------------ There are also other places in drivers/tty/n_tty.c which have similar calls to waitqueue_active(), so instead of adding many memory barrier calls, this patch simply removes the call to waitqueue_active(), leaving just wake_up*() behind. This fixes both problems because, even though the memory access before or after the spinlocks in both wake_up*() and add_wait_queue() can sneak into the critical section, it cannot go past it and the critical section assures that they will be serialized (please see "INTER-CPU ACQUIRING BARRIER EFFECTS" in Documentation/memory-barriers.txt for a better explanation). Moreover, the resulting code is much simpler. Latency measurement using a ping-pong test over a pty doesn't show any visible performance drop. Signed-off-by: Kosuke Tatsukawa Signed-off-by: Greg Kroah-Hartman [bwh: Backported to 3.2: - Use wake_up_interruptible(), not wake_up_interruptible_poll() - There are only two spurious uses of waitqueue_active() to remove] Signed-off-by: Ben Hutchings (cherry picked from commit 80910ccdd3ee35e4131df38bc73b86ee60abdf0b) [wt: file is drivers/char/n_tty.c in 2.6.32] Signed-off-by: Willy Tarreau --- drivers/char/n_tty.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/drivers/char/n_tty.c b/drivers/char/n_tty.c index 5269fa0..34be7fd 100644 --- a/drivers/char/n_tty.c +++ b/drivers/char/n_tty.c @@ -1287,8 +1287,7 @@ handle_newline: tty->canon_data++; spin_unlock_irqrestore(&tty->read_lock, flags); kill_fasync(&tty->fasync, SIGIO, POLL_IN); - if (waitqueue_active(&tty->read_wait)) - wake_up_interruptible(&tty->read_wait); + wake_up_interruptible(&tty->read_wait); return; } } @@ -1410,8 +1409,7 @@ static void n_tty_receive_buf(struct tty_struct *tty, const unsigned char *cp, if (!tty->icanon && (tty->read_cnt >= tty->minimum_to_wake)) { kill_fasync(&tty->fasync, SIGIO, POLL_IN); - if (waitqueue_active(&tty->read_wait)) - wake_up_interruptible(&tty->read_wait); + wake_up_interruptible(&tty->read_wait); } /* -- 1.7.12.2.21.g234cd45.dirty