From: Paul Durrant
Reply-To: paul@xen.org
Subject: Re: xen-netback hotplug-status regression bug
To: Michael Brown, Wei Liu, xen-devel@lists.xenproject.org, netdev@vger.kernel.org, Paul Durrant
Date: Tue, 13 Apr 2021 08:12:44 +0100

On 10/04/2021 19:25, Michael Brown wrote:
> Commit https://github.com/torvalds/linux/commit/1f25657 ("xen-netback:
> remove 'hotplug-status' once it has served its purpose") seems to have
> introduced a regression that prevents a vif frontend from transitioning
> more than once into Connected state.
>
> As far as I can tell:
>
> - The defined vif script (e.g. /etc/xen/scripts/vif-bridge) executes
>   only once, at domU startup, and sets
>   backend/vif//0/hotplug-status="connected"
>
> - When the frontend first enters Connected state,
>   drivers/net/xen-netback/xenbus.c's connect() sets up a watch on
>   "hotplug-status" with the callback function hotplug_status_changed()
>
> - When hotplug_status_changed() is triggered by the watch, it
>   transitions the backend to Connected state and calls xenbus_rm() to
>   delete the "hotplug-status" attribute.
>
> If the frontend subsequently disconnects and reconnects (e.g.
> transitions through Closed->Initialising->Connected) then:
>
> - Nothing recreates "hotplug-status"
>
> - When the frontend re-enters Connected state, connect() sets up a watch
>   on "hotplug-status" again
>
> - The callback hotplug_status_changed() is never triggered, and so the
>   backend device never transitions to Connected state.
>

That's not how I read it. Given that "hotplug-status" is removed by the
call to hotplug_status_changed() then the next call to connect() should
fail to register the watch and 'have_hotplug_status_watch' should be 0.
Thus backend_switch_state() should not defer the transition to
XenbusStateConnected in any subsequent interaction with the frontend.

>
> Reverting the commit would fix this bug, but would obviously also
> reintroduce the race condition that the commit was designed to avoid.
>
> I'm happy to put together a patch, if one of the maintainers could
> suggest a sensible design approach.
>

Are you seeing the watch successfully re-registered even though the node
does not exist? Perhaps there has been a change in xenstore behaviour?

  Paul

> I'm not a list member, so please CC me directly on replies.
>
> Thanks,
>
> Michael
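
To make the two readings above easier to compare, here is a small userspace toy model in C. It is not the xen-netback code: the toy_* names, the struct layout and the watch_missing_ok switch are all hypothetical and exist only for illustration. It assumes that a watch fires once when it is registered and that the callback acts only if the node can be read. The single question it isolates is whether registering a watch on a missing "hotplug-status" node succeeds.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; the field names echo the thread but nothing here is kernel code. */
struct toy_backend {
    bool node_exists;               /* is "hotplug-status" present in xenstore? */
    bool have_hotplug_status_watch; /* was the watch registered? */
    bool connected;                 /* did the backend reach Connected? */
};

/* Assumption under test: may a watch be registered on a node that does not exist? */
static bool toy_register_watch(const struct toy_backend *be, bool watch_missing_ok)
{
    return be->node_exists || watch_missing_ok;
}

/* Stand-in for hotplug_status_changed(): complete the switch, drop watch and node. */
static void toy_hotplug_status_changed(struct toy_backend *be)
{
    assert(be->node_exists);            /* only called when the node is readable */
    be->connected = true;
    be->have_hotplug_status_watch = false;
    be->node_exists = false;            /* models the xenbus_rm() call */
}

/* Stand-in for connect() followed by backend_switch_state(). */
static void toy_connect(struct toy_backend *be, bool watch_missing_ok)
{
    be->connected = false;
    be->have_hotplug_status_watch = toy_register_watch(be, watch_missing_ok);

    if (!be->have_hotplug_status_watch) {
        be->connected = true;           /* no watch pending: switch state immediately */
        return;
    }
    if (be->node_exists)                /* modelled as the watch firing on registration */
        toy_hotplug_status_changed(be);
    /* otherwise the transition stays deferred until something writes the node */
}

/* Connect, then reconnect without anything recreating the node. */
static bool toy_reconnect_succeeds(bool watch_missing_ok)
{
    struct toy_backend be = { .node_exists = true }; /* hotplug script ran once */
    toy_connect(&be, watch_missing_ok);
    toy_connect(&be, watch_missing_ok);
    return be.connected;
}
```

In this model toy_reconnect_succeeds(false) returns true, which matches the reading that the second watch registration fails and the state change is not deferred. toy_reconnect_succeeds(true) returns false, which matches the reported hang. Which branch real xenstore takes is exactly the open question in this thread.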