From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.3 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_SANE_1 autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E3589C433E2 for ; Fri, 28 Aug 2020 18:26:57 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id B67082074A for ; Fri, 28 Aug 2020 18:26:57 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b="vOYlFM74" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728074AbgH1S04 (ORCPT ); Fri, 28 Aug 2020 14:26:56 -0400 Received: from smtp-fw-6002.amazon.com ([52.95.49.90]:42402 "EHLO smtp-fw-6002.amazon.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726714AbgH1S0z (ORCPT ); Fri, 28 Aug 2020 14:26:55 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1598639213; x=1630175213; h=date:from:to:subject:message-id:references:mime-version: in-reply-to; bh=YlXbxw/O9okB0AahvRarA2ldWVFp/ZEo5xO3UByBIcM=; b=vOYlFM74MDWVojXBzRaAMsJzXq8viV8GqMjP0GfSqwMQMt4Fsfu5kvyT G4sNkn34osHmRQs+u3Djc0Htnh6k60rMHAm4muSyEzGIlN4hNGEhLxAhY kjn9nd05R5GyC930LiAs1X2uBI4klZenva+N97hQWye2cK4O0JZOlAish M=; X-IronPort-AV: E=Sophos;i="5.76,364,1592870400"; d="scan'208";a="50648672" Received: from iad12-co-svc-p1-lb1-vlan3.amazon.com (HELO email-inbound-relay-2c-c6afef2e.us-west-2.amazon.com) ([10.43.8.6]) by smtp-border-fw-out-6002.iad6.amazon.com with ESMTP; 28 Aug 2020 18:26:50 +0000 Received: from EX13MTAUWA001.ant.amazon.com (pdx4-ws-svc-p6-lb7-vlan2.pdx.amazon.com [10.170.41.162]) by email-inbound-relay-2c-c6afef2e.us-west-2.amazon.com (Postfix) with ESMTPS id C3D3AA2536; Fri, 28 Aug 2020 18:26:48 +0000 (UTC) Received: from EX13D10UWA003.ant.amazon.com (10.43.160.248) by EX13MTAUWA001.ant.amazon.com (10.43.160.58) with Microsoft SMTP Server (TLS) id 15.0.1497.2; Fri, 28 Aug 2020 18:26:40 +0000 Received: from EX13MTAUWA001.ant.amazon.com (10.43.160.58) by EX13D10UWA003.ant.amazon.com (10.43.160.248) with Microsoft SMTP Server (TLS) id 15.0.1497.2; Fri, 28 Aug 2020 18:26:40 +0000 Received: from dev-dsk-anchalag-2a-9c2d1d96.us-west-2.amazon.com (172.22.96.68) by mail-relay.amazon.com (10.43.160.118) with Microsoft SMTP Server id 15.0.1497.2 via Frontend Transport; Fri, 28 Aug 2020 18:26:39 +0000 Received: by dev-dsk-anchalag-2a-9c2d1d96.us-west-2.amazon.com (Postfix, from userid 4335130) id 2E14F4087C; Fri, 28 Aug 2020 18:26:40 +0000 (UTC) Date: Fri, 28 Aug 2020 18:26:40 +0000 From: Anchal Agarwal To: , , , , , , , , , , , , , , , , , , , , , , , , , , Subject: Re: [PATCH v3 00/11] Fix PM hibernation in Xen guests Message-ID: <20200828182640.GA20719@dev-dsk-anchalag-2a-9c2d1d96.us-west-2.amazon.com> References: MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Aug 21, 2020 at 10:22:43PM +0000, Anchal Agarwal wrote: > Hello, > This series fixes PM hibernation for hvm guests running on xen hypervisor. > The running guest could now be hibernated and resumed successfully at a > later time. The fixes for PM hibernation are added to block and > network device drivers i.e xen-blkfront and xen-netfront. Any other driver > that needs to add S4 support if not already, can follow same method of > introducing freeze/thaw/restore callbacks. > The patches had been tested against upstream kernel and xen4.11. Large > scale testing is also done on Xen based Amazon EC2 instances. All this testing > involved running memory exhausting workload in the background. > > Doing guest hibernation does not involve any support from hypervisor and > this way guest has complete control over its state. Infrastructure > restrictions for saving up guest state can be overcome by guest initiated > hibernation. > > These patches were send out as RFC before and all the feedback had been > incorporated in the patches. The last v1 & v2 could be found here: > > [v1]: https://lkml.org/lkml/2020/5/19/1312 > [v2]: https://lkml.org/lkml/2020/7/2/995 > All comments and feedback from v2 had been incorporated in v3 series. > > Known issues: > 1.KASLR causes intermittent hibernation failures. VM fails to resumes and > has to be restarted. I will investigate this issue separately and shouldn't > be a blocker for this patch series. > 2. During hibernation, I observed sometimes that freezing of tasks fails due > to busy XFS workqueuei[xfs-cil/xfs-sync]. This is also intermittent may be 1 > out of 200 runs and hibernation is aborted in this case. Re-trying hibernation > may work. Also, this is a known issue with hibernation and some > filesystems like XFS has been discussed by the community for years with not an > effectve resolution at this point. > > Testing How to: > --------------- > 1. Setup xen hypervisor on a physical machine[ I used Ubuntu 16.04 +upstream > xen-4.11] > 2. Bring up a HVM guest w/t kernel compiled with hibernation patches > [I used ubuntu18.04 netboot bionic images and also Amazon Linux on-prem images]. > 3. Create a swap file size=RAM size > 4. Update grub parameters and reboot > 5. Trigger pm-hibernation from within the VM > > Example: > Set up a file-backed swap space. Swap file size>=Total memory on the system > sudo dd if=/dev/zero of=/swap bs=$(( 1024 * 1024 )) count=4096 # 4096MiB > sudo chmod 600 /swap > sudo mkswap /swap > sudo swapon /swap > > Update resume device/resume offset in grub if using swap file: > resume=/dev/xvda1 resume_offset=200704 no_console_suspend=1 > > Execute: > -------- > sudo pm-hibernate > OR > echo disk > /sys/power/state && echo reboot > /sys/power/disk > > Compute resume offset code: > " > #!/usr/bin/env python > import sys > import array > import fcntl > > #swap file > f = open(sys.argv[1], 'r') > buf = array.array('L', [0]) > > #FIBMAP > ret = fcntl.ioctl(f.fileno(), 0x01, buf) > print buf[0] > " > > Aleksei Besogonov (1): > PM / hibernate: update the resume offset on SNAPSHOT_SET_SWAP_AREA > > Anchal Agarwal (4): > x86/xen: Introduce new function to map HYPERVISOR_shared_info on > Resume > x86/xen: save and restore steal clock during PM hibernation > xen: Introduce wrapper for save/restore sched clock offset > xen: Update sched clock offset to avoid system instability in > hibernation > > Munehisa Kamata (5): > xen/manage: keep track of the on-going suspend mode > xenbus: add freeze/thaw/restore callbacks support > x86/xen: add system core suspend and resume callbacks > xen-blkfront: add callbacks for PM suspend and hibernation > xen-netfront: add callbacks for PM suspend and hibernation > > Thomas Gleixner (1): > genirq: Shutdown irq chips in suspend/resume during hibernation > > arch/x86/xen/enlighten_hvm.c | 7 +++ > arch/x86/xen/suspend.c | 63 ++++++++++++++++++++ > arch/x86/xen/time.c | 15 ++++- > arch/x86/xen/xen-ops.h | 3 + > drivers/block/xen-blkfront.c | 122 ++++++++++++++++++++++++++++++++++++-- > drivers/net/xen-netfront.c | 96 +++++++++++++++++++++++++++++- > drivers/xen/events/events_base.c | 1 + > drivers/xen/manage.c | 46 ++++++++++++++ > drivers/xen/xenbus/xenbus_probe.c | 96 +++++++++++++++++++++++++----- > include/linux/irq.h | 2 + > include/xen/xen-ops.h | 3 + > include/xen/xenbus.h | 3 + > kernel/irq/chip.c | 2 +- > kernel/irq/internals.h | 1 + > kernel/irq/pm.c | 31 +++++++--- > kernel/power/user.c | 7 ++- > 16 files changed, 464 insertions(+), 34 deletions(-) > > -- > 2.16.6 > A gentle ping on the series in case there is any more feedback or can we plan to merge this? I can then send the series with minor fixes pointed by tglx@ Thanks, Anchal From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.3 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_SANE_1 autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1972BC433E2 for ; Fri, 28 Aug 2020 18:27:16 +0000 (UTC) Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id D85622074A for ; Fri, 28 Aug 2020 18:27:15 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b="vOYlFM74" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org D85622074A Authentication-Results: mail.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=amazon.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=xen-devel-bounces@lists.xenproject.org Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1kBj5T-0002d2-Sq; Fri, 28 Aug 2020 18:26:55 +0000 Received: from all-amaz-eas1.inumbo.com ([34.197.232.57] helo=us1-amaz-eas2.inumbo.com) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1kBj5S-0002cv-4K for xen-devel@lists.xenproject.org; Fri, 28 Aug 2020 18:26:54 +0000 X-Inumbo-ID: c3f55996-c44f-442b-9168-ed2791caf7c0 Received: from smtp-fw-6002.amazon.com (unknown [52.95.49.90]) by us1-amaz-eas2.inumbo.com (Halon) with ESMTPS id c3f55996-c44f-442b-9168-ed2791caf7c0; Fri, 28 Aug 2020 18:26:53 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1598639213; x=1630175213; h=date:from:to:subject:message-id:references:mime-version: in-reply-to; bh=YlXbxw/O9okB0AahvRarA2ldWVFp/ZEo5xO3UByBIcM=; b=vOYlFM74MDWVojXBzRaAMsJzXq8viV8GqMjP0GfSqwMQMt4Fsfu5kvyT G4sNkn34osHmRQs+u3Djc0Htnh6k60rMHAm4muSyEzGIlN4hNGEhLxAhY kjn9nd05R5GyC930LiAs1X2uBI4klZenva+N97hQWye2cK4O0JZOlAish M=; X-IronPort-AV: E=Sophos;i="5.76,364,1592870400"; d="scan'208";a="50648672" Received: from iad12-co-svc-p1-lb1-vlan3.amazon.com (HELO email-inbound-relay-2c-c6afef2e.us-west-2.amazon.com) ([10.43.8.6]) by smtp-border-fw-out-6002.iad6.amazon.com with ESMTP; 28 Aug 2020 18:26:50 +0000 Received: from EX13MTAUWA001.ant.amazon.com (pdx4-ws-svc-p6-lb7-vlan2.pdx.amazon.com [10.170.41.162]) by email-inbound-relay-2c-c6afef2e.us-west-2.amazon.com (Postfix) with ESMTPS id C3D3AA2536; Fri, 28 Aug 2020 18:26:48 +0000 (UTC) Received: from EX13D10UWA003.ant.amazon.com (10.43.160.248) by EX13MTAUWA001.ant.amazon.com (10.43.160.58) with Microsoft SMTP Server (TLS) id 15.0.1497.2; Fri, 28 Aug 2020 18:26:40 +0000 Received: from EX13MTAUWA001.ant.amazon.com (10.43.160.58) by EX13D10UWA003.ant.amazon.com (10.43.160.248) with Microsoft SMTP Server (TLS) id 15.0.1497.2; Fri, 28 Aug 2020 18:26:40 +0000 Received: from dev-dsk-anchalag-2a-9c2d1d96.us-west-2.amazon.com (172.22.96.68) by mail-relay.amazon.com (10.43.160.118) with Microsoft SMTP Server id 15.0.1497.2 via Frontend Transport; Fri, 28 Aug 2020 18:26:39 +0000 Received: by dev-dsk-anchalag-2a-9c2d1d96.us-west-2.amazon.com (Postfix, from userid 4335130) id 2E14F4087C; Fri, 28 Aug 2020 18:26:40 +0000 (UTC) Date: Fri, 28 Aug 2020 18:26:40 +0000 From: Anchal Agarwal To: , , , , , , , , , , , , , , , , , , , , , , , , , , Subject: Re: [PATCH v3 00/11] Fix PM hibernation in Xen guests Message-ID: <20200828182640.GA20719@dev-dsk-anchalag-2a-9c2d1d96.us-west-2.amazon.com> References: MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) Precedence: Bulk X-BeenThere: xen-devel@lists.xenproject.org X-Mailman-Version: 2.1.29 List-Id: Xen developer discussion List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xenproject.org Sender: "Xen-devel" On Fri, Aug 21, 2020 at 10:22:43PM +0000, Anchal Agarwal wrote: > Hello, > This series fixes PM hibernation for hvm guests running on xen hypervisor. > The running guest could now be hibernated and resumed successfully at a > later time. The fixes for PM hibernation are added to block and > network device drivers i.e xen-blkfront and xen-netfront. Any other driver > that needs to add S4 support if not already, can follow same method of > introducing freeze/thaw/restore callbacks. > The patches had been tested against upstream kernel and xen4.11. Large > scale testing is also done on Xen based Amazon EC2 instances. All this testing > involved running memory exhausting workload in the background. > > Doing guest hibernation does not involve any support from hypervisor and > this way guest has complete control over its state. Infrastructure > restrictions for saving up guest state can be overcome by guest initiated > hibernation. > > These patches were send out as RFC before and all the feedback had been > incorporated in the patches. The last v1 & v2 could be found here: > > [v1]: https://lkml.org/lkml/2020/5/19/1312 > [v2]: https://lkml.org/lkml/2020/7/2/995 > All comments and feedback from v2 had been incorporated in v3 series. > > Known issues: > 1.KASLR causes intermittent hibernation failures. VM fails to resumes and > has to be restarted. I will investigate this issue separately and shouldn't > be a blocker for this patch series. > 2. During hibernation, I observed sometimes that freezing of tasks fails due > to busy XFS workqueuei[xfs-cil/xfs-sync]. This is also intermittent may be 1 > out of 200 runs and hibernation is aborted in this case. Re-trying hibernation > may work. Also, this is a known issue with hibernation and some > filesystems like XFS has been discussed by the community for years with not an > effectve resolution at this point. > > Testing How to: > --------------- > 1. Setup xen hypervisor on a physical machine[ I used Ubuntu 16.04 +upstream > xen-4.11] > 2. Bring up a HVM guest w/t kernel compiled with hibernation patches > [I used ubuntu18.04 netboot bionic images and also Amazon Linux on-prem images]. > 3. Create a swap file size=RAM size > 4. Update grub parameters and reboot > 5. Trigger pm-hibernation from within the VM > > Example: > Set up a file-backed swap space. Swap file size>=Total memory on the system > sudo dd if=/dev/zero of=/swap bs=$(( 1024 * 1024 )) count=4096 # 4096MiB > sudo chmod 600 /swap > sudo mkswap /swap > sudo swapon /swap > > Update resume device/resume offset in grub if using swap file: > resume=/dev/xvda1 resume_offset=200704 no_console_suspend=1 > > Execute: > -------- > sudo pm-hibernate > OR > echo disk > /sys/power/state && echo reboot > /sys/power/disk > > Compute resume offset code: > " > #!/usr/bin/env python > import sys > import array > import fcntl > > #swap file > f = open(sys.argv[1], 'r') > buf = array.array('L', [0]) > > #FIBMAP > ret = fcntl.ioctl(f.fileno(), 0x01, buf) > print buf[0] > " > > Aleksei Besogonov (1): > PM / hibernate: update the resume offset on SNAPSHOT_SET_SWAP_AREA > > Anchal Agarwal (4): > x86/xen: Introduce new function to map HYPERVISOR_shared_info on > Resume > x86/xen: save and restore steal clock during PM hibernation > xen: Introduce wrapper for save/restore sched clock offset > xen: Update sched clock offset to avoid system instability in > hibernation > > Munehisa Kamata (5): > xen/manage: keep track of the on-going suspend mode > xenbus: add freeze/thaw/restore callbacks support > x86/xen: add system core suspend and resume callbacks > xen-blkfront: add callbacks for PM suspend and hibernation > xen-netfront: add callbacks for PM suspend and hibernation > > Thomas Gleixner (1): > genirq: Shutdown irq chips in suspend/resume during hibernation > > arch/x86/xen/enlighten_hvm.c | 7 +++ > arch/x86/xen/suspend.c | 63 ++++++++++++++++++++ > arch/x86/xen/time.c | 15 ++++- > arch/x86/xen/xen-ops.h | 3 + > drivers/block/xen-blkfront.c | 122 ++++++++++++++++++++++++++++++++++++-- > drivers/net/xen-netfront.c | 96 +++++++++++++++++++++++++++++- > drivers/xen/events/events_base.c | 1 + > drivers/xen/manage.c | 46 ++++++++++++++ > drivers/xen/xenbus/xenbus_probe.c | 96 +++++++++++++++++++++++++----- > include/linux/irq.h | 2 + > include/xen/xen-ops.h | 3 + > include/xen/xenbus.h | 3 + > kernel/irq/chip.c | 2 +- > kernel/irq/internals.h | 1 + > kernel/irq/pm.c | 31 +++++++--- > kernel/power/user.c | 7 ++- > 16 files changed, 464 insertions(+), 34 deletions(-) > > -- > 2.16.6 > A gentle ping on the series in case there is any more feedback or can we plan to merge this? I can then send the series with minor fixes pointed by tglx@ Thanks, Anchal