From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.1 required=3.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI, SIGNED_OFF_BY,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id BA4E5C43381 for ; Mon, 25 Feb 2019 03:57:35 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 08B082084D for ; Mon, 25 Feb 2019 03:57:35 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=dektech.com.au header.i=@dektech.com.au header.b="ZCpzA0BE" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728616AbfBYD5d (ORCPT ); Sun, 24 Feb 2019 22:57:33 -0500 Received: from f0-dek.dektech.com.au ([210.10.221.142]:33659 "EHLO mail.dektech.com.au" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1728527AbfBYD5d (ORCPT ); Sun, 24 Feb 2019 22:57:33 -0500 Received: from localhost (localhost [127.0.0.1]) by mail.dektech.com.au (Postfix) with ESMTP id 0A635F9482; Mon, 25 Feb 2019 14:57:29 +1100 (AEDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=dektech.com.au; h=references:in-reply-to:x-mailer:message-id:date:date:subject :subject:from:from:received:received:received; s=mail_dkim; t= 1551067048; bh=ty0VMELf1drrukQ1CJ4wLRSrmWPn1Cc5rHFYo29vUdM=; b=Z CpzA0BErWRvoAQukjNA/0QqxFg7PdVjwZdqzvLBNZXifMCD9WgyrHzajVEJddkJ6 HRxRPBSf4QH5UN/lLEtvAqWwVbnR6CBTvkyNiVy8ndyRmQ/cIbeO2IUpYLOQzRPZ BN7dISkwEBDiZEVC2g2LIx4OVR3jT3CAgXJ8oS3svQ= X-Virus-Scanned: amavisd-new at dektech.com.au Received: from mail.dektech.com.au ([127.0.0.1]) by localhost (mail2.dektech.com.au [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id raPY233xaTxO; Mon, 25 Feb 2019 14:57:28 +1100 (AEDT) Received: from mail.dektech.com.au (localhost [127.0.0.1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.dektech.com.au (Postfix) with ESMTPS id CD348F948A; Mon, 25 Feb 2019 14:57:28 +1100 (AEDT) Received: from tung-VirtualBox.dek-tpc.internal (unknown [14.161.14.188]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mail.dektech.com.au (Postfix) with ESMTPSA id B05DCF9482; Mon, 25 Feb 2019 14:57:27 +1100 (AEDT) From: Tung Nguyen To: davem@davemloft.net, netdev@vger.kernel.org Cc: tipc-discussion@lists.sourceforge.net Subject: [net v3 1/1] tipc: fix race condition causing hung sendto Date: Mon, 25 Feb 2019 10:57:20 +0700 Message-Id: <20190225035720.5175-1-tung.q.nguyen@dektech.com.au> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190221033121.4220-1-tung.q.nguyen@dektech.com.au> References: <20190221033121.4220-1-tung.q.nguyen@dektech.com.au> Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org When sending multicast messages via blocking socket, if sending link is congested (tsk->cong_link_cnt is set to 1), the sending thread will be put into sleeping state. However, tipc_sk_filter_rcv() is called under socket spin lock but tipc_wait_for_cond() is not. So, there is no guarantee that the setting of tsk->cong_link_cnt to 0 in tipc_sk_proto_rcv() in CPU-1 will be perceived by CPU-0. If that is the case, the sending thread in CPU-0 after being waken up, will continue to see tsk->cong_link_cnt as 1 and put the sending thread into sleeping state again. The sending thread will sleep forever. CPU-0 | CPU-1 tipc_wait_for_cond() | { | // condition_ = !tsk->cong_link_cnt | while ((rc_ = !(condition_))) { | ... | release_sock(sk_); | wait_woken(); | | if (!sock_owned_by_user(sk)) | tipc_sk_filter_rcv() | { | ... | tipc_sk_proto_rcv() | { | ... | tsk->cong_link_cnt--; | ... | sk->sk_write_space(sk); | ... | } | ... | } sched_annotate_sleep(); | lock_sock(sk_); | remove_wait_queue(); | } | } | This commit fixes it by adding memory barrier to tipc_sk_proto_rcv() and tipc_wait_for_cond(). Acked-by: Jon Maloy Signed-off-by: Tung Nguyen --- net/tipc/socket.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/net/tipc/socket.c b/net/tipc/socket.c index 684f2125fc6b..70343ac448b1 100644 --- a/net/tipc/socket.c +++ b/net/tipc/socket.c @@ -379,11 +379,13 @@ static int tipc_sk_sock_err(struct socket *sock, long *timeout) #define tipc_wait_for_cond(sock_, timeo_, condition_) \ ({ \ + DEFINE_WAIT_FUNC(wait_, woken_wake_function); \ struct sock *sk_; \ int rc_; \ \ while ((rc_ = !(condition_))) { \ - DEFINE_WAIT_FUNC(wait_, woken_wake_function); \ + /* coupled with smp_wmb() in tipc_sk_proto_rcv() */ \ + smp_rmb(); \ sk_ = (sock_)->sk; \ rc_ = tipc_sk_sock_err((sock_), timeo_); \ if (rc_) \ @@ -1983,6 +1985,8 @@ static void tipc_sk_proto_rcv(struct sock *sk, return; case SOCK_WAKEUP: tipc_dest_del(&tsk->cong_links, msg_orignode(hdr), 0); + /* coupled with smp_rmb() in tipc_wait_for_cond() */ + smp_wmb(); tsk->cong_link_cnt--; wakeup = true; break; -- 2.17.1