linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: <ankita@nvidia.com>
To: <ankita@nvidia.com>, <jgg@nvidia.com>,
	<alex.williamson@redhat.com>, <naoya.horiguchi@nec.com>,
	<maz@kernel.org>, <oliver.upton@linux.dev>
Cc: <aniketa@nvidia.com>, <cjia@nvidia.com>, <kwankhede@nvidia.com>,
	<targupta@nvidia.com>, <vsethi@nvidia.com>, <acurrid@nvidia.com>,
	<apopple@nvidia.com>, <jhubbard@nvidia.com>, <danw@nvidia.com>,
	<kvm@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
	<linux-arm-kernel@lists.infradead.org>, <linux-mm@kvack.org>
Subject: [PATCH v3 0/6] Expose GPU memory as coherently CPU accessible
Date: Wed, 5 Apr 2023 11:01:28 -0700	[thread overview]
Message-ID: <20230405180134.16932-1-ankita@nvidia.com> (raw)

From: Ankit Agrawal <ankita@nvidia.com>

NVIDIA's upcoming Grace Hopper Superchip provides a PCI-like device
for the on-chip GPU that is the logical OS representation of the
internal propritary cache coherent interconnect.

This representation has a number of limitations compared to a real PCI
device, in particular, it does not model the coherent GPU memory
aperture as a PCI config space BAR, and PCI doesn't know anything
about cacheable memory types.

Provide a VFIO PCI variant driver that adapts the unique PCI
representation into a more standard PCI representation facing
userspace. The GPU memory aperture is obtained from ACPI, according to
the FW specification, and exported to userspace as the VFIO_REGION
that covers the first PCI BAR. qemu will naturally generate a PCI
device in the VM where the cacheable aperture is reported in BAR1.

Since this memory region is actually cache coherent with the CPU, the
VFIO variant driver will mmap it into VMA using a cacheable mapping.

As this is the first time an ARM environment has placed cacheable
non-struct page backed memory (eg from remap_pfn_range) into a KVM
page table, fix a bug in ARM KVM where it does not copy the cacheable
memory attributes from non-struct page backed PTEs to ensure the guest
also gets a cacheable mapping.

Finally, the cacheable memory can participate in memory failure
handling. ECC failures on this memory will trigger the normal ARM
mechanism to get into memory-failure.c. Since this memory is not
backed by struct page create a mechanism to route the memory-failure's
physical address to the VMA owner so that a SIGBUS can be generated
toward the correct process. This works with the existing KVM/qemu
handling for memory failure reporting toward a guest.

This goes along with a qemu series to provides the necessary
implementation of the Grace Hopper Superchip firmware specification so
that the guest operating system can see the correct ACPI modeling for
the coherent GPU device.
https://github.com/qemu/qemu/compare/master...ankita-nv:qemu:dev-ankit/cohmem-0330

Applied and tested over v6.3-rc4.

Ankit Agrawal (6):
  kvm: determine memory type from VMA
  vfio/nvgpu: expose GPU device memory as BAR1
  mm: handle poisoning of pfn without struct pages
  mm: Add poison error check in fixup_user_fault() for mapped PFN
  mm: Change ghes code to allow poison of non-struct PFN
  vfio/nvgpu: register device memory for poison handling

 MAINTAINERS                          |   6 +
 arch/arm64/include/asm/kvm_pgtable.h |   8 +-
 arch/arm64/include/asm/memory.h      |   6 +-
 arch/arm64/kvm/hyp/pgtable.c         |  16 +-
 arch/arm64/kvm/mmu.c                 |  27 +-
 drivers/acpi/apei/ghes.c             |  12 +-
 drivers/vfio/pci/Kconfig             |   2 +
 drivers/vfio/pci/Makefile            |   2 +
 drivers/vfio/pci/nvgpu/Kconfig       |  10 +
 drivers/vfio/pci/nvgpu/Makefile      |   3 +
 drivers/vfio/pci/nvgpu/main.c        | 359 +++++++++++++++++++++++++++
 include/linux/memory-failure.h       |  22 ++
 include/linux/mm.h                   |   1 +
 include/ras/ras_event.h              |   1 +
 mm/gup.c                             |   2 +-
 mm/memory-failure.c                  | 148 +++++++++--
 virt/kvm/kvm_main.c                  |   6 +
 17 files changed, 586 insertions(+), 45 deletions(-)
 create mode 100644 drivers/vfio/pci/nvgpu/Kconfig
 create mode 100644 drivers/vfio/pci/nvgpu/Makefile
 create mode 100644 drivers/vfio/pci/nvgpu/main.c
 create mode 100644 include/linux/memory-failure.h

-- 
2.17.1



             reply	other threads:[~2023-04-05 18:02 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-04-05 18:01 ankita [this message]
2023-04-05 18:01 ` [PATCH v3 1/6] kvm: determine memory type from VMA ankita
2023-04-12 12:43   ` Marc Zyngier
2023-04-12 13:01     ` Jason Gunthorpe
2023-05-31 11:35       ` Catalin Marinas
2023-06-14 12:44         ` Jason Gunthorpe
2023-07-14  8:10         ` Benjamin Herrenschmidt
2023-07-16 15:09           ` Catalin Marinas
2023-07-16 22:30             ` Jason Gunthorpe
2023-07-17 18:35               ` Alex Williamson
2023-07-25  6:18                 ` Benjamin Herrenschmidt
2023-04-05 18:01 ` [PATCH v3 2/6] vfio/nvgpu: expose GPU device memory as BAR1 ankita
2023-04-05 21:07   ` kernel test robot
2023-04-05 18:01 ` [PATCH v3 3/6] mm: handle poisoning of pfn without struct pages ankita
2023-04-05 21:07   ` kernel test robot
2023-05-09  9:51   ` HORIGUCHI NAOYA(堀口 直也)
2023-05-15 11:18     ` Ankit Agrawal
2023-05-23  5:43       ` HORIGUCHI NAOYA(堀口 直也)
2023-04-05 18:01 ` [PATCH v3 4/6] mm: Add poison error check in fixup_user_fault() for mapped PFN ankita
2023-04-05 18:01 ` [PATCH v3 5/6] mm: Change ghes code to allow poison of non-struct PFN ankita
2023-04-05 18:01 ` [PATCH v3 6/6] vfio/nvgpu: register device memory for poison handling ankita
2023-04-05 20:24   ` Zhi Wang
2023-04-05 21:50   ` kernel test robot
2023-05-24  9:53   ` Dan Carpenter
2023-04-06 12:07 ` [PATCH v3 0/6] Expose GPU memory as coherently CPU accessible David Hildenbrand
2023-04-12  8:43   ` Ankit Agrawal
2023-04-12  9:48     ` Marc Zyngier
2023-04-12 12:28 ` Marc Zyngier
2023-04-12 12:53   ` Jason Gunthorpe
2023-04-13  9:52     ` Marc Zyngier
2023-04-13 13:19       ` Jason Gunthorpe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230405180134.16932-1-ankita@nvidia.com \
    --to=ankita@nvidia.com \
    --cc=acurrid@nvidia.com \
    --cc=alex.williamson@redhat.com \
    --cc=aniketa@nvidia.com \
    --cc=apopple@nvidia.com \
    --cc=cjia@nvidia.com \
    --cc=danw@nvidia.com \
    --cc=jgg@nvidia.com \
    --cc=jhubbard@nvidia.com \
    --cc=kvm@vger.kernel.org \
    --cc=kwankhede@nvidia.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=maz@kernel.org \
    --cc=naoya.horiguchi@nec.com \
    --cc=oliver.upton@linux.dev \
    --cc=targupta@nvidia.com \
    --cc=vsethi@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).