Subject: Re: [PATCH] RDMA: Add rdma_connect_locked()
From: Guoqing Jiang
To: Jason Gunthorpe, Danil Kipnis, Doug Ledford, Christoph Hellwig,
 Jack Wang, Keith Busch, linux-nvme@lists.infradead.org,
 linux-rdma@vger.kernel.org, Max Gurtovoy, netdev@vger.kernel.org,
 rds-devel@oss.oracle.com, Sagi Grimberg, Santosh Shilimkar
Cc: Leon Romanovsky
Date: Tue, 27 Oct 2020 13:05:00 +0100
Message-ID: <11bb18bd-a26a-d0e2-9ff6-6d7e2bf3fb86@cloud.ionos.com>
In-Reply-To: <0-v1-75e124dbad74+b05-rdma_connect_locking_jgg@nvidia.com>
References: <0-v1-75e124dbad74+b05-rdma_connect_locking_jgg@nvidia.com>
X-Mailing-List: linux-rdma@vger.kernel.org

On 10/26/20 15:25, Jason Gunthorpe wrote:
> There are two flows for handling RDMA_CM_EVENT_ROUTE_RESOLVED: either the
> handler triggers a completion and another thread does rdma_connect() or
> the handler directly calls rdma_connect().
>
> In all cases rdma_connect() needs to hold the handler_mutex, but when
> handlers are invoked this is already held by the core code. This causes
> ULPs using the 2nd method to deadlock.
>
> Provide a rdma_connect_locked() and have all ULPs call it from their
> handlers.
>
> Reported-by: Guoqing Jiang
> Fixes: 2a7cec538169 ("RDMA/cma: Fix locking for the RDMA_CM_CONNECT state")
> Signed-off-by: Jason Gunthorpe
> ---
>  drivers/infiniband/core/cma.c            | 39 +++++++++++++++++++++---
>  drivers/infiniband/ulp/iser/iser_verbs.c |  2 +-
>  drivers/infiniband/ulp/rtrs/rtrs-clt.c   |  4 +--
>  drivers/nvme/host/rdma.c                 | 10 +++---
>  include/rdma/rdma_cm.h                   | 13 +-------
>  net/rds/ib_cm.c                          |  5 +--
>  6 files changed, 47 insertions(+), 26 deletions(-)
>
> Seems people are not testing these four ULPs against rdma-next. Here is a
> quick fix for the issue:
>
> https://lore.kernel.org/r/3b1f7767-98e2-93e0-b718-16d1c5346140@cloud.ionos.com

With this patch applied, I no longer see the previous call trace.

Tested-by: Guoqing Jiang

Thanks,
Guoqing
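
For readers following the thread, the locking problem described in the quoted patch can be illustrated with a small userspace analogy. This is only a sketch under stated assumptions, not the kernel code: it uses a POSIX error-checking mutex in place of the kernel's handler_mutex so that a recursive lock attempt returns EDEADLK instead of hanging, and every demo_* name is hypothetical.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a CM id; not the kernel's struct rdma_cm_id. */
struct demo_id {
	pthread_mutex_t handler_mutex;
	bool connected;
};

int demo_id_init(struct demo_id *id)
{
	pthread_mutexattr_t attr;
	int ret;

	id->connected = false;
	ret = pthread_mutexattr_init(&attr);
	if (ret)
		return ret;
	/* ERRORCHECK: relocking from the same thread returns EDEADLK. */
	ret = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	if (!ret)
		ret = pthread_mutex_init(&id->handler_mutex, &attr);
	pthread_mutexattr_destroy(&attr);
	return ret;
}

/* Caller must already hold handler_mutex (i.e. handler context). */
int demo_connect_locked(struct demo_id *id)
{
	id->connected = true;
	return 0;
}

/* Takes handler_mutex itself; meant for callers outside a handler. */
int demo_connect(struct demo_id *id)
{
	int ret = pthread_mutex_lock(&id->handler_mutex);

	if (ret)
		return ret; /* EDEADLK if this thread already holds the lock */
	ret = demo_connect_locked(id);
	pthread_mutex_unlock(&id->handler_mutex);
	return ret;
}

/* Models the core code invoking a handler with handler_mutex held. */
int demo_dispatch(struct demo_id *id, int (*handler)(struct demo_id *))
{
	int ret = pthread_mutex_lock(&id->handler_mutex);

	if (ret)
		return ret;
	ret = handler(id);
	pthread_mutex_unlock(&id->handler_mutex);
	return ret;
}
```

In this model, passing demo_connect as the handler to demo_dispatch() corresponds to the second flow in the patch description: the lock is taken twice by the same thread, which a normal mutex would turn into a deadlock. Passing demo_connect_locked instead succeeds, because it relies on the lock the dispatcher already holds.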