From: Tung Nguyen
To: davem@davemloft.net, netdev@vger.kernel.org
Cc: tipc-discussion@lists.sourceforge.net
Subject: [tipc-discussion][net 1/1] tipc: fix race condition causing hung sendto
Date: Tue, 19 Feb 2019 14:03:10 +0700
Message-Id: <20190219070310.23888-2-tung.q.nguyen@dektech.com.au>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20190219070310.23888-1-tung.q.nguyen@dektech.com.au>
References: <20190219070310.23888-1-tung.q.nguyen@dektech.com.au>

When sending multicast messages via a blocking socket, if the sending
link is congested (tsk->cong_link_cnt is set to 1), the sending thread
is put to sleep. However, tipc_sk_filter_rcv() is called under the
socket spin lock but tipc_wait_for_cond() is not. So there is no
guarantee that the setting of tsk->cong_link_cnt to 0 in
tipc_sk_proto_rcv() on CPU-1 will be perceived by CPU-0. If that is the
case, the sending thread on CPU-0, after being woken up, will continue
to see tsk->cong_link_cnt as 1 and go back to sleep. The sending thread
will then sleep forever.

CPU-0                                 | CPU-1
tipc_wait_for_cond()                  |
{                                     |
 // condition_ = !tsk->cong_link_cnt  |
 while ((rc_ = !(condition_))) {      |
  ...                                 |
  release_sock(sk_);                  |
  wait_woken();                       |
                                      | if (!sock_owned_by_user(sk))
                                      |  tipc_sk_filter_rcv()
                                      |  {
                                      |   ...
                                      |   tipc_sk_proto_rcv()
                                      |   {
                                      |    ...
                                      |    tsk->cong_link_cnt--;
                                      |    ...
                                      |    sk->sk_write_space(sk);
                                      |    ...
                                      |   }
                                      |   ...
                                      |  }
  sched_annotate_sleep();             |
  lock_sock(sk_);                     |
  remove_wait_queue();                |
 }                                    |
}                                     |

This commit fixes it by adding memory barriers to tipc_sk_proto_rcv()
and tipc_wait_for_cond().

Acked-by: Jon Maloy
Signed-off-by: Tung Nguyen
---
 net/tipc/socket.c | 4 ++++
 1 file changed, 4 insertions(+)
 mode change 100644 => 100755 net/tipc/socket.c

diff --git a/net/tipc/socket.c b/net/tipc/socket.c
old mode 100644
new mode 100755
index 1217c90a363b..d8f054d45941
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -383,6 +383,8 @@ static int tipc_sk_sock_err(struct socket *sock, long *timeout)
 	int rc_; \
 	\
 	while ((rc_ = !(condition_))) { \
+		/* coupled with smp_wmb() in tipc_sk_proto_rcv() */ \
+		smp_rmb(); \
 		DEFINE_WAIT_FUNC(wait_, woken_wake_function); \
 		sk_ = (sock_)->sk; \
 		rc_ = tipc_sk_sock_err((sock_), timeo_); \
@@ -1982,6 +1984,8 @@ static void tipc_sk_proto_rcv(struct sock *sk,
 		return;
 	case SOCK_WAKEUP:
 		tipc_dest_del(&tsk->cong_links, msg_orignode(hdr), 0);
+		/* coupled with smp_rmb() in tipc_wait_for_cond() */
+		smp_wmb();
 		tsk->cong_link_cnt--;
 		wakeup = true;
 		break;
-- 
2.17.1
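
For readers who want to experiment with the barrier pairing outside the
kernel, here is a rough user-space sketch in C11. It is not the kernel
code: struct demo_sock and the demo_* functions are hypothetical names
invented for illustration, the list is reduced to a length counter, and
atomic_thread_fence() with release/acquire ordering is only an
approximate stand-in for smp_wmb()/smp_rmb(), which have their own
kernel-specific semantics.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the TIPC socket fields touched by the patch. */
struct demo_sock {
	atomic_int cong_links_len;	/* models the tsk->cong_links list */
	atomic_int cong_link_cnt;	/* models tsk->cong_link_cnt */
};

/* Writer side, modelled on the SOCK_WAKEUP case in tipc_sk_proto_rcv(). */
void demo_proto_rcv_wakeup(struct demo_sock *s)
{
	/* A wakeup only makes sense while at least one link is congested. */
	assert(atomic_load_explicit(&s->cong_link_cnt,
				    memory_order_relaxed) > 0);

	/* Models tipc_dest_del(): drop the congested link from the list. */
	atomic_fetch_sub_explicit(&s->cong_links_len, 1, memory_order_relaxed);

	/* Assumed analogue of smp_wmb(): list update before counter update. */
	atomic_thread_fence(memory_order_release);

	atomic_fetch_sub_explicit(&s->cong_link_cnt, 1, memory_order_relaxed);
}

/* Reader side, modelled on the condition checked by tipc_wait_for_cond(). */
bool demo_link_uncongested(struct demo_sock *s)
{
	int cnt = atomic_load_explicit(&s->cong_link_cnt,
				       memory_order_relaxed);

	/* Assumed analogue of smp_rmb(): counter load before later loads. */
	atomic_thread_fence(memory_order_acquire);

	return cnt == 0;
}

/*
 * Bounded polling loop standing in for the sleeping wait loop.
 * Returns the number of failed polls before success, or -1 if
 * max_polls checks all saw the link as congested.
 */
int demo_wait_for_cond(struct demo_sock *s, int max_polls)
{
	for (int i = 0; i < max_polls; i++) {
		if (demo_link_uncongested(s))
			return i;
	}
	return -1;
}
```

In a real experiment demo_proto_rcv_wakeup() and demo_wait_for_cond()
would run on different threads. The sketch only shows where the two
fences sit relative to the counter update and the condition check.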