From: Anchal Agarwal <anchalag@amazon.com>
To: <tglx@linutronix.de>, <mingo@redhat.com>, <bp@alien8.de>,
	<hpa@zytor.com>, <x86@kernel.org>, <boris.ostrovsky@oracle.com>,
	<jgross@suse.com>, <linux-pm@vger.kernel.org>,
	<linux-mm@kvack.org>, <kamatam@amazon.com>,
	<sstabellini@kernel.org>, <konrad.wilk@oracle.co>,
	<roger.pau@citrix.com>, <axboe@kernel.dk>, <davem@davemloft.net>,
	<rjw@rjwysocki.net>, <len.brown@intel.com>, <pavel@ucw.cz>,
	<peterz@infradead.org>, <eduval@amazon.com>, <sblbir@amazon.com>,
	<anchalag@amazon.com>, <xen-devel@lists.xenproject.org>,
	<vkuznets@redhat.com>, <netdev@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>,
	<Woodhouse@dev-dsk-anchalag-2a-9c2d1d96.us-west-2.amazon.com>,
	<dwmw@amazon.co.uk>, <fllinden@amaozn.com>
Cc: <anchalag@amazon.com>
Subject: [RFC PATCH V2 00/11] Enable PM hibernation on guest VMs
Date: Tue, 7 Jan 2020 23:36:24 +0000
Message-ID: <20200107233624.GA16802@dev-dsk-anchalag-2a-9c2d1d96.us-west-2.amazon.com>

Hello,
I am sending out V2 of a patch series that implements guest PM hibernation.
These guests run on the Xen hypervisor. The patches have been tested
against the mainline kernel. EC2 instance hibernation is a feature provided
to AWS EC2 customers. PM hibernation uses swap space carved out within
the guest [or a separate partition], where the hibernation image is
stored and from which it is restored.

Why is guest hibernation needed:
Guest hibernation does not require any support from the hypervisor, so the
guest has complete control over its state. Infrastructure restrictions, such as
having to save guest state, can be overcome by guest-initiated hibernation.

This series includes some improvements over the earlier RFC series (June 2018):
https://lists.xenproject.org/archives/html/xen-devel/2018-06/msg00823.html

Any comments or suggestions are welcome.

Changelog v2:
1. Removed timeout/request present on the ring in xen-blkfront during blkfront freeze
2. Fixed restoring of PIRQs, which apparently worked on 4.9 kernels but not on
newer kernels. [Legacy IRQs were no longer restored after hibernation; this was
introduced by commit "020db9d3c1dc0"]
3. Merged a couple of related patches to make the code more coherent and readable
4. Code refactoring
5. Sched clock fix for when the hibernating guest is under heavy CPU load
Note: Under very rare circumstances we see resume failures with KASLR enabled, only
on Xen instances. We are seeing roughly 3% failures [>1000 runs] when testing with
various instance sizes and some workload running on each instance. I am currently
investigating to confirm whether it is a Xen issue or a kernel issue.
However, it should not hold back anyone from reviewing/accepting these patches.

Testing done:
All testing was done using Amazon Linux images with a stock upstream kernel
installed. All testing covered multiple hibernation cycles.

i. Multiple loops [~100] of hibernation in disk mode <reboot> with 5.4 guest kernel + Xen 4.11
ii. Hibernation tested with a memory stress tester running in the background on smaller and
larger instance sizes on EC2 [>500 runs]
iii. Testing was also done on a physical host machine [Ubuntu 18.04/4.15 kernel/stock xen-4.6]
running Amazon Linux 2 as a guest VM with multiple queues.
iv. Ran dd to write a large file with bs=1k and hibernated multiple times

How to test:
------------
Example:
Set up a file-backed swap space. The swap file size must be >= total memory on the system.
sudo dd if=/dev/zero of=/swap bs=$(( 1024 * 1024 )) count=4096 # 4096MiB
sudo chmod 600 /swap
sudo mkswap /swap
sudo swapon /swap
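
The size requirement above can be checked before hibernating. The following is
only a minimal sketch, not part of the series: the helper names are
hypothetical, and it assumes the usual "MemTotal: <n> kB" line found in
/proc/meminfo.

```python
import re


def mem_total_bytes(meminfo_text):
    """Parse MemTotal from /proc/meminfo-style text and return it in bytes."""
    match = re.search(r"^MemTotal:\s+(\d+)\s+kB$", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("MemTotal line not found")
    # /proc/meminfo reports kB in units of 1024 bytes.
    return int(match.group(1)) * 1024


def swap_file_large_enough(swap_size_bytes, meminfo_text):
    """Return True if the swap file is at least as large as total memory."""
    return swap_size_bytes >= mem_total_bytes(meminfo_text)
```

On a real guest this could be called as
swap_file_large_enough(os.path.getsize("/swap"), open("/proc/meminfo").read()).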

Update the resume device and resume offset on the kernel command line in grub if using a swap file:
resume=/dev/xvda1 resume_offset=200704

Execute:
--------
sudo pm-hibernate
OR
echo reboot > /sys/power/disk && echo disk > /sys/power/state
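
For scripted test loops, the same sysfs writes can be issued from Python. This
is a hedged sketch only: request_hibernation and its sysfs_power parameter are
hypothetical names introduced here so the function can be exercised against a
scratch directory. It selects the hibernation mode in the "disk" file first,
because writing "disk" to the "state" file is what starts hibernation.

```python
import os


def request_hibernation(mode="reboot", sysfs_power="/sys/power"):
    """Select the hibernation mode, then ask the kernel to hibernate.

    Must run as root when pointed at the real /sys/power.
    """
    # Step 1: choose what happens once the image is written (e.g. reboot).
    with open(os.path.join(sysfs_power, "disk"), "w") as mode_file:
        mode_file.write(mode)
    # Step 2: trigger hibernation; on real sysfs this returns after resume.
    with open(os.path.join(sysfs_power, "state"), "w") as state_file:
        state_file.write("disk")
```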

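Compute resume offset code (run as root):
"
#!/usr/bin/env python3
import sys
import array
import fcntl

FIBMAP = 0x01  # ioctl: map a logical file block to its physical block

# Swap file path is the first argument
with open(sys.argv[1], 'rb') as f:
    # In: logical block 0 of the file; out: its physical block number.
    # FIBMAP takes a pointer to int, hence typecode 'i'.
    buf = array.array('i', [0])
    fcntl.ioctl(f.fileno(), FIBMAP, buf)

print(buf[0])
"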
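
One caveat, stated as an assumption rather than something this series
specifies: FIBMAP reports the block number in filesystem block units, while
resume_offset is understood to be counted in pages. When both sizes are 4096
bytes the value can be used directly; otherwise a conversion along these lines
may be needed. The function name is hypothetical.

```python
def resume_offset_pages(first_block, fs_block_size, page_size):
    """Convert a FIBMAP block number into an offset counted in pages."""
    byte_offset = first_block * fs_block_size
    if byte_offset % page_size != 0:
        # A swap header that is not page aligned cannot be expressed in pages.
        raise ValueError("swap file start is not page aligned")
    return byte_offset // page_size
```

For example, resume_offset_pages(200704, 4096, 4096) leaves the value
unchanged, while a filesystem with 1024-byte blocks would need its block
number divided by four.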

Aleksei Besogonov (1):
  PM / hibernate: update the resume offset on SNAPSHOT_SET_SWAP_AREA

Anchal Agarwal (2):
  x86/xen: Introduce new function to map HYPERVISOR_shared_info on
    Resume
  xen: Clear IRQD_IRQ_STARTED flag during shutdown PIRQs

Eduardo Valentin (1):
  x86: tsc: avoid system instability in hibernation

Munehisa Kamata (7):
  xen/manage: keep track of the on-going suspend mode
  xenbus: add freeze/thaw/restore callbacks support
  x86/xen: add system core suspend and resume callbacks
  xen-netfront: add callbacks for PM suspend and hibernation support
  xen-blkfront: add callbacks for PM suspend and hibernation
  x86/xen: save and restore steal clock during hibernation
  x86/xen: close event channels for PIRQs in system core suspend
    callback

 arch/x86/kernel/tsc.c             |  29 ++++++++++
 arch/x86/xen/enlighten_hvm.c      |   8 +++
 arch/x86/xen/suspend.c            |  66 +++++++++++++++++++++
 arch/x86/xen/time.c               |   3 +
 arch/x86/xen/xen-ops.h            |   1 +
 drivers/block/xen-blkfront.c      | 119 +++++++++++++++++++++++++++++++++++---
 drivers/net/xen-netfront.c        |  98 ++++++++++++++++++++++++++++++-
 drivers/xen/events/events_base.c  |  13 +++++
 drivers/xen/manage.c              |  73 +++++++++++++++++++++++
 drivers/xen/time.c                |  28 ++++++++-
 drivers/xen/xenbus/xenbus_probe.c |  99 +++++++++++++++++++++++++------
 include/linux/irq.h               |   1 +
 include/linux/sched/clock.h       |   5 ++
 include/xen/events.h              |   1 +
 include/xen/xen-ops.h             |   8 +++
 include/xen/xenbus.h              |   3 +
 kernel/irq/chip.c                 |   3 +-
 kernel/power/user.c               |   6 +-
 kernel/sched/clock.c              |   4 +-
 19 files changed, 537 insertions(+), 31 deletions(-)

-- 
2.15.3.AMZN

