linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/3] kvm "fake DAX" device
From: Pankaj Gupta @ 2018-08-31 13:30 UTC (permalink / raw)
  To: linux-kernel, kvm, qemu-devel, linux-nvdimm
  Cc: jack, stefanha, dan.j.williams, riel, nilal, kwolf, pbonzini,
	ross.zwisler, david, xiaoguangrong.eric, hch, mst,
	niteshnarayanlal, lcapitulino, imammedo, eblake, pagupta

 This patch series implements "fake DAX".
 "fake DAX" is fake persistent memory (nvdimm) in the guest
 which allows the guest page cache to be bypassed. The series also
 implements a VIRTIO based asynchronous flush mechanism.
 
 The guest driver and qemu device changes are shared in separate
 patch sets for easier review; they have been tested together.
 
 Details of the project idea for the 'fake DAX' flushing interface
 are shared in [2] & [3].

 The implementation is divided into two parts:
 a new virtio pmem guest driver and qemu code changes for a new
 virtio pmem paravirtualized device.

1. Guest virtio-pmem kernel driver
---------------------------------
   - Reads the persistent memory range from the paravirt device and
     registers it with 'nvdimm_bus'.
   - The 'nvdimm/pmem' driver uses this information to allocate a
     persistent memory region and set up filesystem operations
     on the allocated memory.
   - The virtio pmem driver implements an asynchronous flushing
     interface to flush from guest to host.
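
   As a rough userspace illustration of the flush hook idea (a region
   carries an optional flush callback, and the pmem layer calls it
   instead of a fixed flush routine), something like the sketch below
   could be used. All names here (struct region, default_flush,
   virtio_flush, region_flush) are hypothetical stand-ins and are not
   the identifiers used in the patches; the real virtio flush would
   queue a request to the host and wait for completion.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's region object. */
struct region {
	/* Optional provider-specific flush; NULL selects the default. */
	int (*flush)(struct region *r);
	int default_flushes;	/* calls that took the default path */
	int virtio_flushes;	/* calls that took the paravirtual path */
};

/* Stand-in for the generic flush used by regular pmem regions. */
static int default_flush(struct region *r)
{
	r->default_flushes++;
	return 0;
}

/*
 * Stand-in for a paravirtual flush. A real driver would send a
 * request to the host here and wait for it to complete.
 */
static int virtio_flush(struct region *r)
{
	r->virtio_flushes++;
	return 0;
}

/* Entry point for the pmem layer: dispatch on the per-region callback. */
static int region_flush(struct region *r)
{
	assert(r != NULL);
	if (r->flush)
		return r->flush(r);
	return default_flush(r);
}
```

   A region registered by the virtio driver would set .flush to
   virtio_flush, while other regions leave it NULL and keep the
   default behaviour, so the caller needs no flag checks.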

2. Qemu virtio-pmem device
---------------------------------
   - Creates the virtio pmem device and exposes a memory range to
     the KVM guest.
   - On the host side this is file-backed memory which acts as
     persistent memory.
   - The Qemu side flush uses the aio thread pool APIs and virtio
     to handle multiple guest requests asynchronously.

   David Hildenbrand (CCed) also posted a modified version [4] of
   the qemu virtio-pmem code based on the updated Qemu memory device API.
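
   To make the host side of the flush concrete, here is a minimal
   sketch that assumes the flush primitive is fsync() on the backing
   file descriptor. The helper names map_backing and host_flush are
   hypothetical, and the aio thread pool and virtio request handling
   are deliberately left out; this is not the Qemu code itself.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Size the backing file to len bytes and map it shared, the way
 * file-backed guest memory would be mapped on the host.
 * Returns NULL on failure.
 */
static void *map_backing(int fd, size_t len)
{
	void *p;

	if (ftruncate(fd, (off_t)len) != 0)
		return NULL;
	p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	return p == MAP_FAILED ? NULL : p;
}

/*
 * Hypothetical flush handler: ask the kernel to write the file's
 * dirty data to stable storage. Returns 0 on success, -1 on error.
 */
static int host_flush(int fd)
{
	assert(fd >= 0);
	return fsync(fd) == 0 ? 0 : -1;
}
```

   In a real device the call to host_flush would run in a worker
   thread so that the vcpu is not blocked, and its return value would
   be passed back to the guest in the request completion.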

 Virtio-pmem error handling:
 ----------------------------------------
  Checked the behaviour of virtio-pmem for the types of errors below.
  Suggestions are needed on the expected behaviour for handling them.

  - Hardware errors: uncorrectable recoverable errors:
  a] virtio-pmem:
    - As per the current logic, if the error page belongs to the Qemu
      process, the host MCE handler isolates (hwpoisons) that page and
      sends SIGBUS. The Qemu SIGBUS handler injects an exception into
      the KVM guest.
    - The KVM guest then isolates the page and sends SIGBUS to the guest
      userspace process which has mapped the page.
  
  b] Existing implementation for the ACPI pmem driver:
    - Handles such errors with an MCE notifier and creates a list
      of bad blocks. Read/direct access DAX operations return EIO
      if the accessed memory page falls in the bad block list.
    - It also starts background scrubbing.
    - Similar functionality can be reused in virtio-pmem with an MCE
      notifier but without scrubbing (no ACPI/ARS)? Inputs are needed to
      confirm whether this behaviour is ok or needs any change.
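
  To illustrate the bad block check described in b], a minimal
  userspace sketch is given below. It assumes the bad block list is a
  plain array of byte ranges; struct bad_range and check_access are
  hypothetical names for illustration and do not correspond to the
  kernel's own bad block handling code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bad block entry: a poisoned byte range [start, start + len). */
struct bad_range {
	uint64_t start;
	uint64_t len;
};

/*
 * Return -EIO if the access [off, off + len) overlaps any recorded
 * bad range, 0 otherwise.
 */
static int check_access(const struct bad_range *bad, size_t n,
			uint64_t off, uint64_t len)
{
	size_t i;

	assert(bad != NULL || n == 0);
	for (i = 0; i < n; i++) {
		uint64_t bad_end = bad[i].start + bad[i].len;

		/* Two half-open ranges overlap if each starts before the other ends. */
		if (off < bad_end && bad[i].start < off + len)
			return -EIO;
	}
	return 0;
}
```

  An MCE notifier would append to the list when a page is reported as
  poisoned, and the read/DAX paths would call the check before
  touching the range.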

Changes from RFC v3: [1]
- Rebase to latest upstream - Luiz
- Call ndregion->flush in place of nvdimm_flush - Luiz
- kmalloc return check - Luiz
- virtqueue full handling - Stefan
- Don't map entire virtio_pmem_req to device - Stefan
- Fix request leak, correct sizeof req - Stefan
- Move declaration to virtio_pmem.c

Changes from RFC v2:
- Add flush function in the nd_region in place of switching
  on a flag - Dan & Stefan
- Add flush completion function with proper locking and wait
  for host side flush completion - Stefan & Dan
- Keep userspace API in uapi header file - Stefan, MST
- Use LE fields & New device id - MST
- Indentation & spacing suggestions - MST & Eric
- Remove extra header files & add licensing - Stefan

Changes from RFC v1:
- Reuse existing 'pmem' code for registering persistent 
  memory and other operations instead of creating an entirely 
  new block driver.
- Use VIRTIO driver to register memory information with 
  nvdimm_bus and create region_type accordingly. 
- Call VIRTIO flush from existing pmem driver.

Pankaj Gupta (3):
   nd: move nd_region to common header
   libnvdimm: nd_region flush callback support
   virtio-pmem: Add virtio-pmem guest driver

[1] https://lkml.org/lkml/2018/7/13/102
[2] https://www.spinics.net/lists/kvm/msg149761.html
[3] https://www.spinics.net/lists/kvm/msg153095.html  
[4] https://marc.info/?l=qemu-devel&m=153555721901824&w=2

 drivers/acpi/nfit/core.c         |    7 -
 drivers/nvdimm/claim.c           |    3 
 drivers/nvdimm/nd.h              |   39 -----
 drivers/nvdimm/pmem.c            |   12 +
 drivers/nvdimm/region_devs.c     |   12 +
 drivers/virtio/Kconfig           |    9 +
 drivers/virtio/Makefile          |    1 
 drivers/virtio/virtio_pmem.c     |  255 +++++++++++++++++++++++++++++++++++++++
 include/linux/libnvdimm.h        |    4 
 include/linux/nd.h               |   40 ++++++
 include/uapi/linux/virtio_ids.h  |    1 
 include/uapi/linux/virtio_pmem.h |   40 ++++++
 12 files changed, 374 insertions(+), 49 deletions(-)


Thread overview: 27+ messages
2018-08-31 13:30 [PATCH 0/3] kvm "fake DAX" device Pankaj Gupta
2018-08-31 13:30 ` [PATCH 1/3] nd: move nd_region to common header Pankaj Gupta
2018-09-22  0:47   ` Dan Williams
2018-09-24 11:40     ` Pankaj Gupta
2018-08-31 13:30 ` [PATCH 2/3] libnvdimm: nd_region flush callback support Pankaj Gupta
2018-09-04 15:29   ` kbuild test robot
2018-09-05  8:40     ` Pankaj Gupta
2018-09-22  0:43   ` Dan Williams
2018-09-24 11:07     ` Pankaj Gupta
2018-08-31 13:30 ` [PATCH 3/3] virtio-pmem: Add virtio pmem driver Pankaj Gupta
2018-09-04 15:17   ` kbuild test robot
2018-09-05  8:34     ` Pankaj Gupta
2018-09-05 12:02   ` kbuild test robot
2018-09-12 16:54   ` Luiz Capitulino
2018-09-13  6:58     ` [Qemu-devel] " Pankaj Gupta
2018-09-13 12:19       ` Luiz Capitulino
2018-09-14 12:13         ` Pankaj Gupta
2018-09-22  1:08   ` Dan Williams
2018-09-24  9:41     ` Pankaj Gupta
2018-09-27 13:06       ` Pankaj Gupta
2018-09-27 15:55         ` Dan Williams
2018-08-31 13:30 ` [PATCH] qemu: Add virtio pmem device Pankaj Gupta
2018-09-12 16:57   ` Luiz Capitulino
2018-09-13  7:06     ` Pankaj Gupta
2018-09-13 12:22       ` Luiz Capitulino
2018-09-20 11:21   ` David Hildenbrand
2018-09-20 12:03     ` [Qemu-devel] " Pankaj Gupta
