From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-11.4 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS, USER_AGENT_SANE_1 autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1301FC433E3 for ; Fri, 21 Aug 2020 22:29:36 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id DFEA520738 for ; Fri, 21 Aug 2020 22:29:35 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b="BLcenG1f" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727807AbgHUW3e (ORCPT ); Fri, 21 Aug 2020 18:29:34 -0400 Received: from smtp-fw-6001.amazon.com ([52.95.48.154]:13171 "EHLO smtp-fw-6001.amazon.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726773AbgHUW3e (ORCPT ); Fri, 21 Aug 2020 18:29:34 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1598048972; x=1629584972; h=date:from:to:subject:message-id:references:mime-version: in-reply-to; bh=0l+CXtX6lcZ9WB2ya1Fw3XHogwe7aFcEY8gTvIE6+gU=; b=BLcenG1fHzerVehod+oxsFTwyuS+9v1c8mYhZ3ojDMCGLUfM+UOnlsPK f5+yjvbrY7S3CN/fNvURZpsRQfuxwrKYpvTsEn8RVI9u8HLNaH7slmhgB AG08WZApOlCjx0aX3vJepoA2Djc6X5jsbU38fVGN+GAm5nGu+j9LWZpXo c=; X-IronPort-AV: E=Sophos;i="5.76,338,1592870400"; d="scan'208";a="50742767" Received: from iad12-co-svc-p1-lb1-vlan3.amazon.com (HELO email-inbound-relay-2c-cc689b93.us-west-2.amazon.com) ([10.43.8.6]) by smtp-border-fw-out-6001.iad6.amazon.com with ESMTP; 21 Aug 2020 22:29:29 +0000 Received: from EX13MTAUWA001.ant.amazon.com (pdx4-ws-svc-p6-lb7-vlan3.pdx.amazon.com [10.170.41.166]) by email-inbound-relay-2c-cc689b93.us-west-2.amazon.com (Postfix) with ESMTPS id C9F43120F51; Fri, 21 Aug 2020 22:29:22 +0000 (UTC) Received: from EX13D01UWA001.ant.amazon.com (10.43.160.60) by EX13MTAUWA001.ant.amazon.com (10.43.160.58) with Microsoft SMTP Server (TLS) id 15.0.1497.2; Fri, 21 Aug 2020 22:29:15 +0000 Received: from EX13MTAUWA001.ant.amazon.com (10.43.160.58) by EX13d01UWA001.ant.amazon.com (10.43.160.60) with Microsoft SMTP Server (TLS) id 15.0.1497.2; Fri, 21 Aug 2020 22:29:15 +0000 Received: from dev-dsk-anchalag-2a-9c2d1d96.us-west-2.amazon.com (172.22.96.68) by mail-relay.amazon.com (10.43.160.118) with Microsoft SMTP Server id 15.0.1497.2 via Frontend Transport; Fri, 21 Aug 2020 22:29:15 +0000 Received: by dev-dsk-anchalag-2a-9c2d1d96.us-west-2.amazon.com (Postfix, from userid 4335130) id AEB9140362; Fri, 21 Aug 2020 22:29:15 +0000 (UTC) Date: Fri, 21 Aug 2020 22:29:15 +0000 From: Anchal Agarwal To: , , , , , , , , , , , , , , , , , , , , , , , , , , , Subject: [PATCH v3 07/11] xen-netfront: add callbacks for PM suspend and hibernation Message-ID: References: MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Munehisa Kamata Add freeze, thaw and restore callbacks for PM suspend and hibernation support. The freeze handler simply disconnects the frotnend from the backend and frees resources associated with queues after disabling the net_device from the system. The restore handler just changes the frontend state and let the xenbus handler to re-allocate the resources and re-connect to the backend. This can be performed transparently to the rest of the system. The handlers are used for both PM suspend and hibernation so that we can keep the existing suspend/resume callbacks for Xen suspend without modification. Freezing netfront devices is normally expected to finish within a few hundred milliseconds, but it can rarely take more than 5 seconds and hit the hard coded timeout, it would depend on backend state which may be congested and/or have complex configuration. While it's rare case, longer default timeout seems a bit more reasonable here to avoid hitting the timeout. Also, make it configurable via module parameter so that we can cover broader setups than what we know currently. [Anchal Agarwal: Changelog]: RFCv1->RFCv2: Variable name fix and checkpatch.pl fixes] v2->v3: Resolved merge conflicts Signed-off-by: Anchal Agarwal Signed-off-by: Munehisa Kamata --- drivers/net/xen-netfront.c | 96 +++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 95 insertions(+), 1 deletion(-) diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c index 458be6882b98..3ea3ecc6e0d3 100644 --- a/drivers/net/xen-netfront.c +++ b/drivers/net/xen-netfront.c @@ -47,6 +47,7 @@ #include #include #include +#include #include #include @@ -59,6 +60,12 @@ #include #include +enum netif_freeze_state { + NETIF_FREEZE_STATE_UNFROZEN, + NETIF_FREEZE_STATE_FREEZING, + NETIF_FREEZE_STATE_FROZEN, +}; + /* Module parameters */ #define MAX_QUEUES_DEFAULT 8 static unsigned int xennet_max_queues; @@ -68,6 +75,12 @@ MODULE_PARM_DESC(max_queues, #define XENNET_TIMEOUT (5 * HZ) +static unsigned int netfront_freeze_timeout_secs = 10; +module_param_named(freeze_timeout_secs, + netfront_freeze_timeout_secs, uint, 0644); +MODULE_PARM_DESC(freeze_timeout_secs, + "timeout when freezing netfront device in seconds"); + static const struct ethtool_ops xennet_ethtool_ops; struct netfront_cb { @@ -174,6 +187,9 @@ struct netfront_info { bool netfront_xdp_enabled; atomic_t rx_gso_checksum_fixup; + + int freeze_state; + struct completion wait_backend_disconnected; }; struct netfront_rx_info { @@ -798,6 +814,21 @@ static int xennet_close(struct net_device *dev) return 0; } +static int xennet_disable_interrupts(struct net_device *dev) +{ + struct netfront_info *np = netdev_priv(dev); + unsigned int num_queues = dev->real_num_tx_queues; + unsigned int queue_index; + struct netfront_queue *queue; + + for (queue_index = 0; queue_index < num_queues; ++queue_index) { + queue = &np->queues[queue_index]; + disable_irq(queue->tx_irq); + disable_irq(queue->rx_irq); + } + return 0; +} + static void xennet_move_rx_slot(struct netfront_queue *queue, struct sk_buff *skb, grant_ref_t ref) { @@ -1532,6 +1563,8 @@ static struct net_device *xennet_create_dev(struct xenbus_device *dev) np->queues = NULL; + init_completion(&np->wait_backend_disconnected); + err = -ENOMEM; np->rx_stats = netdev_alloc_pcpu_stats(struct netfront_stats); if (np->rx_stats == NULL) @@ -2084,6 +2117,50 @@ static int xennet_create_queues(struct netfront_info *info, return 0; } +static int netfront_freeze(struct xenbus_device *dev) +{ + struct netfront_info *info = dev_get_drvdata(&dev->dev); + unsigned long timeout = netfront_freeze_timeout_secs * HZ; + int err = 0; + + xennet_disable_interrupts(info->netdev); + + netif_device_detach(info->netdev); + + info->freeze_state = NETIF_FREEZE_STATE_FREEZING; + + /* Kick the backend to disconnect */ + xenbus_switch_state(dev, XenbusStateClosing); + + /* We don't want to move forward before the frontend is diconnected + * from the backend cleanly. + */ + timeout = wait_for_completion_timeout(&info->wait_backend_disconnected, + timeout); + if (!timeout) { + err = -EBUSY; + xenbus_dev_error(dev, err, "Freezing timed out;" + "the device may become inconsistent state"); + return err; + } + + /* Tear down queues */ + xennet_disconnect_backend(info); + xennet_destroy_queues(info); + + info->freeze_state = NETIF_FREEZE_STATE_FROZEN; + + return err; +} + +static int netfront_restore(struct xenbus_device *dev) +{ + /* Kick the backend to re-connect */ + xenbus_switch_state(dev, XenbusStateInitialising); + + return 0; +} + /* Common code used when first setting up, and when resuming. */ static int talk_to_netback(struct xenbus_device *dev, struct netfront_info *info) @@ -2302,6 +2379,8 @@ static int xennet_connect(struct net_device *dev) spin_unlock_bh(&queue->rx_lock); } + np->freeze_state = NETIF_FREEZE_STATE_UNFROZEN; + return 0; } @@ -2339,10 +2418,22 @@ static void netback_changed(struct xenbus_device *dev, break; case XenbusStateClosed: - if (dev->state == XenbusStateClosed) + if (dev->state == XenbusStateClosed) { + /* dpm context is waiting for the backend */ + if (np->freeze_state == NETIF_FREEZE_STATE_FREEZING) + complete(&np->wait_backend_disconnected); break; + } /* Fall through - Missed the backend's CLOSING state. */ case XenbusStateClosing: + /* We may see unexpected Closed or Closing from the backend. + * Just ignore it not to prevent the frontend from being + * re-connected in the case of PM suspend or hibernation. + */ + if (np->freeze_state == NETIF_FREEZE_STATE_FROZEN && + dev->state == XenbusStateInitialising) { + break; + } xenbus_frontend_closed(dev); break; } @@ -2505,6 +2596,9 @@ static struct xenbus_driver netfront_driver = { .probe = netfront_probe, .remove = xennet_remove, .resume = netfront_resume, + .freeze = netfront_freeze, + .thaw = netfront_restore, + .restore = netfront_restore, .otherend_changed = netback_changed, }; -- 2.16.6 From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-11.3 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS, URIBL_BLOCKED,USER_AGENT_SANE_1 autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 79699C433DF for ; Fri, 21 Aug 2020 22:29:45 +0000 (UTC) Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 4C5622076E for ; Fri, 21 Aug 2020 22:29:45 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b="BLcenG1f" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 4C5622076E Authentication-Results: mail.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=amazon.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=xen-devel-bounces@lists.xenproject.org Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1k9FXR-0004nX-VX; Fri, 21 Aug 2020 22:29:33 +0000 Received: from us1-rack-iad1.inumbo.com ([172.99.69.81]) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1k9FXQ-0004ml-PD for xen-devel@lists.xenproject.org; Fri, 21 Aug 2020 22:29:32 +0000 X-Inumbo-ID: 3185c895-8779-49ad-b517-da8613a37690 Received: from smtp-fw-6001.amazon.com (unknown [52.95.48.154]) by us1-rack-iad1.inumbo.com (Halon) with ESMTPS id 3185c895-8779-49ad-b517-da8613a37690; Fri, 21 Aug 2020 22:29:31 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1598048972; x=1629584972; h=date:from:to:subject:message-id:references:mime-version: in-reply-to; bh=0l+CXtX6lcZ9WB2ya1Fw3XHogwe7aFcEY8gTvIE6+gU=; b=BLcenG1fHzerVehod+oxsFTwyuS+9v1c8mYhZ3ojDMCGLUfM+UOnlsPK f5+yjvbrY7S3CN/fNvURZpsRQfuxwrKYpvTsEn8RVI9u8HLNaH7slmhgB AG08WZApOlCjx0aX3vJepoA2Djc6X5jsbU38fVGN+GAm5nGu+j9LWZpXo c=; X-IronPort-AV: E=Sophos;i="5.76,338,1592870400"; d="scan'208";a="50742767" Received: from iad12-co-svc-p1-lb1-vlan3.amazon.com (HELO email-inbound-relay-2c-cc689b93.us-west-2.amazon.com) ([10.43.8.6]) by smtp-border-fw-out-6001.iad6.amazon.com with ESMTP; 21 Aug 2020 22:29:29 +0000 Received: from EX13MTAUWA001.ant.amazon.com (pdx4-ws-svc-p6-lb7-vlan3.pdx.amazon.com [10.170.41.166]) by email-inbound-relay-2c-cc689b93.us-west-2.amazon.com (Postfix) with ESMTPS id C9F43120F51; Fri, 21 Aug 2020 22:29:22 +0000 (UTC) Received: from EX13D01UWA001.ant.amazon.com (10.43.160.60) by EX13MTAUWA001.ant.amazon.com (10.43.160.58) with Microsoft SMTP Server (TLS) id 15.0.1497.2; Fri, 21 Aug 2020 22:29:15 +0000 Received: from EX13MTAUWA001.ant.amazon.com (10.43.160.58) by EX13d01UWA001.ant.amazon.com (10.43.160.60) with Microsoft SMTP Server (TLS) id 15.0.1497.2; Fri, 21 Aug 2020 22:29:15 +0000 Received: from dev-dsk-anchalag-2a-9c2d1d96.us-west-2.amazon.com (172.22.96.68) by mail-relay.amazon.com (10.43.160.118) with Microsoft SMTP Server id 15.0.1497.2 via Frontend Transport; Fri, 21 Aug 2020 22:29:15 +0000 Received: by dev-dsk-anchalag-2a-9c2d1d96.us-west-2.amazon.com (Postfix, from userid 4335130) id AEB9140362; Fri, 21 Aug 2020 22:29:15 +0000 (UTC) Date: Fri, 21 Aug 2020 22:29:15 +0000 From: Anchal Agarwal To: , , , , , , , , , , , , , , , , , , , , , , , , , , , Subject: [PATCH v3 07/11] xen-netfront: add callbacks for PM suspend and hibernation Message-ID: References: MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) Precedence: Bulk X-BeenThere: xen-devel@lists.xenproject.org X-Mailman-Version: 2.1.29 List-Id: Xen developer discussion List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xenproject.org Sender: "Xen-devel" From: Munehisa Kamata Add freeze, thaw and restore callbacks for PM suspend and hibernation support. The freeze handler simply disconnects the frotnend from the backend and frees resources associated with queues after disabling the net_device from the system. The restore handler just changes the frontend state and let the xenbus handler to re-allocate the resources and re-connect to the backend. This can be performed transparently to the rest of the system. The handlers are used for both PM suspend and hibernation so that we can keep the existing suspend/resume callbacks for Xen suspend without modification. Freezing netfront devices is normally expected to finish within a few hundred milliseconds, but it can rarely take more than 5 seconds and hit the hard coded timeout, it would depend on backend state which may be congested and/or have complex configuration. While it's rare case, longer default timeout seems a bit more reasonable here to avoid hitting the timeout. Also, make it configurable via module parameter so that we can cover broader setups than what we know currently. [Anchal Agarwal: Changelog]: RFCv1->RFCv2: Variable name fix and checkpatch.pl fixes] v2->v3: Resolved merge conflicts Signed-off-by: Anchal Agarwal Signed-off-by: Munehisa Kamata --- drivers/net/xen-netfront.c | 96 +++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 95 insertions(+), 1 deletion(-) diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c index 458be6882b98..3ea3ecc6e0d3 100644 --- a/drivers/net/xen-netfront.c +++ b/drivers/net/xen-netfront.c @@ -47,6 +47,7 @@ #include #include #include +#include #include #include @@ -59,6 +60,12 @@ #include #include +enum netif_freeze_state { + NETIF_FREEZE_STATE_UNFROZEN, + NETIF_FREEZE_STATE_FREEZING, + NETIF_FREEZE_STATE_FROZEN, +}; + /* Module parameters */ #define MAX_QUEUES_DEFAULT 8 static unsigned int xennet_max_queues; @@ -68,6 +75,12 @@ MODULE_PARM_DESC(max_queues, #define XENNET_TIMEOUT (5 * HZ) +static unsigned int netfront_freeze_timeout_secs = 10; +module_param_named(freeze_timeout_secs, + netfront_freeze_timeout_secs, uint, 0644); +MODULE_PARM_DESC(freeze_timeout_secs, + "timeout when freezing netfront device in seconds"); + static const struct ethtool_ops xennet_ethtool_ops; struct netfront_cb { @@ -174,6 +187,9 @@ struct netfront_info { bool netfront_xdp_enabled; atomic_t rx_gso_checksum_fixup; + + int freeze_state; + struct completion wait_backend_disconnected; }; struct netfront_rx_info { @@ -798,6 +814,21 @@ static int xennet_close(struct net_device *dev) return 0; } +static int xennet_disable_interrupts(struct net_device *dev) +{ + struct netfront_info *np = netdev_priv(dev); + unsigned int num_queues = dev->real_num_tx_queues; + unsigned int queue_index; + struct netfront_queue *queue; + + for (queue_index = 0; queue_index < num_queues; ++queue_index) { + queue = &np->queues[queue_index]; + disable_irq(queue->tx_irq); + disable_irq(queue->rx_irq); + } + return 0; +} + static void xennet_move_rx_slot(struct netfront_queue *queue, struct sk_buff *skb, grant_ref_t ref) { @@ -1532,6 +1563,8 @@ static struct net_device *xennet_create_dev(struct xenbus_device *dev) np->queues = NULL; + init_completion(&np->wait_backend_disconnected); + err = -ENOMEM; np->rx_stats = netdev_alloc_pcpu_stats(struct netfront_stats); if (np->rx_stats == NULL) @@ -2084,6 +2117,50 @@ static int xennet_create_queues(struct netfront_info *info, return 0; } +static int netfront_freeze(struct xenbus_device *dev) +{ + struct netfront_info *info = dev_get_drvdata(&dev->dev); + unsigned long timeout = netfront_freeze_timeout_secs * HZ; + int err = 0; + + xennet_disable_interrupts(info->netdev); + + netif_device_detach(info->netdev); + + info->freeze_state = NETIF_FREEZE_STATE_FREEZING; + + /* Kick the backend to disconnect */ + xenbus_switch_state(dev, XenbusStateClosing); + + /* We don't want to move forward before the frontend is diconnected + * from the backend cleanly. + */ + timeout = wait_for_completion_timeout(&info->wait_backend_disconnected, + timeout); + if (!timeout) { + err = -EBUSY; + xenbus_dev_error(dev, err, "Freezing timed out;" + "the device may become inconsistent state"); + return err; + } + + /* Tear down queues */ + xennet_disconnect_backend(info); + xennet_destroy_queues(info); + + info->freeze_state = NETIF_FREEZE_STATE_FROZEN; + + return err; +} + +static int netfront_restore(struct xenbus_device *dev) +{ + /* Kick the backend to re-connect */ + xenbus_switch_state(dev, XenbusStateInitialising); + + return 0; +} + /* Common code used when first setting up, and when resuming. */ static int talk_to_netback(struct xenbus_device *dev, struct netfront_info *info) @@ -2302,6 +2379,8 @@ static int xennet_connect(struct net_device *dev) spin_unlock_bh(&queue->rx_lock); } + np->freeze_state = NETIF_FREEZE_STATE_UNFROZEN; + return 0; } @@ -2339,10 +2418,22 @@ static void netback_changed(struct xenbus_device *dev, break; case XenbusStateClosed: - if (dev->state == XenbusStateClosed) + if (dev->state == XenbusStateClosed) { + /* dpm context is waiting for the backend */ + if (np->freeze_state == NETIF_FREEZE_STATE_FREEZING) + complete(&np->wait_backend_disconnected); break; + } /* Fall through - Missed the backend's CLOSING state. */ case XenbusStateClosing: + /* We may see unexpected Closed or Closing from the backend. + * Just ignore it not to prevent the frontend from being + * re-connected in the case of PM suspend or hibernation. + */ + if (np->freeze_state == NETIF_FREEZE_STATE_FROZEN && + dev->state == XenbusStateInitialising) { + break; + } xenbus_frontend_closed(dev); break; } @@ -2505,6 +2596,9 @@ static struct xenbus_driver netfront_driver = { .probe = netfront_probe, .remove = xennet_remove, .resume = netfront_resume, + .freeze = netfront_freeze, + .thaw = netfront_restore, + .restore = netfront_restore, .otherend_changed = netback_changed, }; -- 2.16.6