From: Haiyang Zhang via iommu <iommu@lists.linux-foundation.org>
To: "Michael Kelley (LINUX)" <mikelley@microsoft.com>,
	Tianyu Lan <ltykernel@gmail.com>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"bp@alien8.de" <bp@alien8.de>,
	"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
	"x86@kernel.org" <x86@kernel.org>,
	"hpa@zytor.com" <hpa@zytor.com>,
	"luto@kernel.org" <luto@kernel.org>,
	"peterz@infradead.org" <peterz@infradead.org>,
	"jgross@suse.com" <jgross@suse.com>,
	"sstabellini@kernel.org" <sstabellini@kernel.org>,
	"boris.ostrovsky@oracle.com" <boris.ostrovsky@oracle.com>,
	KY Srinivasan <kys@microsoft.com>,
	Stephen Hemminger <sthemmin@microsoft.com>,
	"wei.liu@kernel.org" <wei.liu@kernel.org>,
	Dexuan Cui <decui@microsoft.com>,
	"joro@8bytes.org" <joro@8bytes.org>,
	"will@kernel.org" <will@kernel.org>,
	"davem@davemloft.net" <davem@davemloft.net>,
	"kuba@kernel.org" <kuba@kernel.org>,
	"jejb@linux.ibm.com" <jejb@linux.ibm.com>,
	"martin.petersen@oracle.com" <martin.petersen@oracle.com>,
	"hch@lst.de" <hch@lst.de>,
	"m.szyprowski@samsung.com" <m.szyprowski@samsung.com>,
	"robin.murphy@arm.com" <robin.murphy@arm.com>,
	Tianyu Lan <Tianyu.Lan@microsoft.com>,
	"thomas.lendacky@amd.com" <thomas.lendacky@amd.com>,
	"xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>
Cc: "parri.andrea@gmail.com" <parri.andrea@gmail.com>,
	"linux-hyperv@vger.kernel.org" <linux-hyperv@vger.kernel.org>,
	"brijesh.singh@amd.com" <brijesh.singh@amd.com>,
	"linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>,
	"konrad.wilk@oracle.com" <konrad.wilk@oracle.com>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"dave.hansen@intel.com" <dave.hansen@intel.com>,
	"iommu@lists.linux-foundation.org"
	<iommu@lists.linux-foundation.org>,
	vkuznets <vkuznets@redhat.com>
Subject: RE: [PATCH V2 5/6] net: netvsc: Add Isolation VM support for netvsc driver
Date: Thu, 25 Nov 2021 21:58:16 +0000	[thread overview]
Message-ID: <DM6PR21MB12926C3BC4766C78C57D9210CA629@DM6PR21MB1292.namprd21.prod.outlook.com> (raw)
In-Reply-To: <MWHPR21MB1593093B61DC506B64986B14D7619@MWHPR21MB1593.namprd21.prod.outlook.com>



> -----Original Message-----
> From: Michael Kelley (LINUX) <mikelley@microsoft.com>
> Sent: Wednesday, November 24, 2021 12:03 PM
> To: Tianyu Lan <ltykernel@gmail.com>; tglx@linutronix.de; mingo@redhat.com; bp@alien8.de;
> dave.hansen@linux.intel.com; x86@kernel.org; hpa@zytor.com; luto@kernel.org;
> peterz@infradead.org; jgross@suse.com; sstabellini@kernel.org; boris.ostrovsky@oracle.com;
> KY Srinivasan <kys@microsoft.com>; Haiyang Zhang <haiyangz@microsoft.com>; Stephen
> Hemminger <sthemmin@microsoft.com>; wei.liu@kernel.org; Dexuan Cui <decui@microsoft.com>;
> joro@8bytes.org; will@kernel.org; davem@davemloft.net; kuba@kernel.org; jejb@linux.ibm.com;
> martin.petersen@oracle.com; hch@lst.de; m.szyprowski@samsung.com; robin.murphy@arm.com;
> Tianyu Lan <Tianyu.Lan@microsoft.com>; thomas.lendacky@amd.com; xen-
> devel@lists.xenproject.org
> Cc: iommu@lists.linux-foundation.org; linux-hyperv@vger.kernel.org; linux-
> kernel@vger.kernel.org; linux-scsi@vger.kernel.org; netdev@vger.kernel.org; vkuznets
> <vkuznets@redhat.com>; brijesh.singh@amd.com; konrad.wilk@oracle.com;
> parri.andrea@gmail.com; dave.hansen@intel.com
> Subject: RE: [PATCH V2 5/6] net: netvsc: Add Isolation VM support for netvsc driver
> 
> From: Tianyu Lan <ltykernel@gmail.com> Sent: Tuesday, November 23, 2021 6:31 AM
> >
> > In an Isolation VM, all memory shared with the host must be marked visible
> > to the host via a hypercall. vmbus_establish_gpadl() already does this for
> > the netvsc rx/tx ring buffers. The page buffers used by
> > vmbus_sendpacket_pagebuffer() still need to be handled. Use the DMA API to
> > map/unmap this memory when sending/receiving packets, and the Hyper-V
> > swiotlb bounce buffer DMA address will be returned. The swiotlb bounce
> > buffer has already been marked visible to the host during boot.
> >
> > Allocate the rx/tx ring buffers via dma_alloc_noncontiguous() in an
> > Isolation VM. After calling vmbus_establish_gpadl(), which marks these pages
> > visible to the host, map them into unencrypted address space via
> > dma_vmap_noncontiguous().
> >
> 
> The big unresolved topic is how best to do the allocation and mapping of the big netvsc
> send and receive buffers.  Let me summarize and make a recommendation.
> 
> Background
> ==========
> 1.  Each Hyper-V synthetic network device requires a large pre-allocated receive
>      buffer (defaults to 16 Mbytes) and a similar send buffer (defaults to 1 Mbyte).
> 2.  The buffers are allocated in guest memory and shared with the Hyper-V host.
>      As such, in the Hyper-V SNP environment, the memory must be unencrypted
>      and accessed in the Hyper-V guest with shared_gpa_boundary (i.e., VTOM)
>      added to the physical memory address.
> 3.  The buffers need *not* be contiguous in guest physical memory, but must be
>      contiguously mapped in guest kernel virtual space.
> 4.  Network devices may come and go during the life of the VM, so allocation of
>      these buffers and their mappings may be done after Linux has been running for
>      a long time.
> 5.  Performance of the allocation and mapping process is not an issue since it is
>      done only on synthetic network device add/remove.
> 6.  So the primary goals are an appropriate logical abstraction, code that is
>      simple and straightforward, and efficient memory usage.
> 
> Approaches
> ==========
> During the development of these patches, four approaches have been
> implemented:
> 
> 1.  Two virtual mappings:  One from vmalloc() to allocate the guest memory, and
>      the second from vmap_pfn() after adding the shared_gpa_boundary.   This is
>      implemented in Hyper-V or netvsc specific code, with no use of DMA APIs.
>      No separate list of physical pages is maintained, so for creating the second
>      mapping, the PFN list is assembled temporarily by doing virt_to_phys()
>      page-by-page on the vmalloc mapping, and then discarded because it is no
>      longer needed.  [v4 of the original patch series.]
> 
> 2.  Two virtual mappings as in (1) above, but implemented via new DMA calls
>      dma_map_decrypted() and dma_unmap_encrypted().  [v3 of the original
>      patch series.]
> 
> 3.  Two virtual mappings as in (1) above, but implemented via DMA noncontiguous
>       allocation and mapping calls, as enhanced to allow for custom map/unmap
>       implementations.  A list of physical pages is maintained in the dma_sgt_handle
>       as expected by the DMA noncontiguous API.  [New split-off patch series v1 & v2]
> 
> 4.   Single virtual mapping from vmap_pfn().  The netvsc driver allocates physical
>       memory via alloc_pages() with as much contiguity as possible, and maintains a
>       list of physical pages and ranges.   The single virtual mapping is set up with
>       vmap_pfn() after adding shared_gpa_boundary.  [v5 of the original patch series.]
> 
> Both implementations using DMA APIs use very little of the existing DMA machinery.  Both
> require extensions to the DMA APIs, and custom ops functions.
> While in some sense the netvsc send and receive buffers involve DMA, they do not require
> any DMA actions on a per-I/O basis.  It seems better to me to not try to fit these two
> buffers into the DMA model as a one-off.  Let's just use Hyper-V specific code to allocate
> and map them, as is done with the Hyper-V VMbus channel ring buffers.
> 
> That leaves approaches (1) and (4) above.  Between those two, (1) is simpler even though
> there are two virtual mappings.  Using alloc_pages() as in (4) is messy and there's no
> real benefit to using higher order allocations.
> (4) also requires maintaining a separate list of PFNs and ranges, which offsets some of
> the benefits of having only one virtual mapping active at any point in time.
> 
> I don't think there's a clear "right" answer, so it's a judgment call.  We've explored
> what other approaches would look like, and I'd say let's go with
> (1) as the simpler approach.  Thoughts?
> 
I agree with the following goals:
"So the primary goals are an appropriate logical abstraction, code that is
     simple and straightforward, and efficient memory usage."

And approach (1) looks better to me as well.
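
For illustration, here is a minimal sketch of what the second mapping in
approach (1) could look like. The helper name is made up for this example,
the pgprot choice is an assumption, and ms_hyperv.shared_gpa_boundary is
assumed to hold the VTOM value as it does elsewhere in this series:

#include <linux/mm.h>
#include <linux/slab.h>
#include <linux/vmalloc.h>
#include <asm/mshyperv.h>	/* ms_hyperv.shared_gpa_boundary */

/*
 * Sketch only: remap an existing vmalloc() buffer above the VTOM
 * boundary so the CPU accesses it through the shared (unencrypted)
 * alias. The buffer is assumed to have already been made visible to
 * the host via vmbus_establish_gpadl().
 */
static void *netvsc_remap_buf(void *buf, unsigned long size)
{
	unsigned long npages = size >> PAGE_SHIFT;
	unsigned long *pfns;
	void *vaddr;
	unsigned long i;

	/* Temporary PFN list, discarded once the mapping exists. */
	pfns = kcalloc(npages, sizeof(*pfns), GFP_KERNEL);
	if (!pfns)
		return NULL;

	/* Shift each guest PFN above shared_gpa_boundary (VTOM). */
	for (i = 0; i < npages; i++)
		pfns[i] = vmalloc_to_pfn(buf + i * PAGE_SIZE) +
			  (ms_hyperv.shared_gpa_boundary >> PAGE_SHIFT);

	vaddr = vmap_pfn(pfns, npages, PAGE_KERNEL);
	kfree(pfns);

	return vaddr;
}

The buffer setup would then be roughly: vzalloc() the buffer, establish
the GPADL, remap with a helper like the one above, and use the returned
address for CPU access; teardown would vunmap() the remapped address
before vfree()ing the original allocation.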

Thanks,
- Haiyang


Thread overview:
2021-11-23 14:30 [PATCH V2 0/6] x86/Hyper-V: Add Hyper-V Isolation VM support(Second part) Tianyu Lan
2021-11-23 14:30 ` [PATCH V2 1/6] Swiotlb: Add Swiotlb bounce buffer remap function for HV IVM Tianyu Lan
2021-11-23 17:15   ` Michael Kelley (LINUX)
2021-11-24 14:07     ` Tianyu Lan
2021-11-23 14:30 ` [PATCH V2 2/6] dma-mapping: Add vmap/vunmap_noncontiguous() callback in dma ops Tianyu Lan
2021-11-23 14:30 ` [PATCH V2 3/6] x86/hyper-v: Add hyperv Isolation VM check in the cc_platform_has() Tianyu Lan
2021-11-23 14:30 ` [PATCH V2 4/6] hyperv/IOMMU: Enable swiotlb bounce buffer for Isolation VM Tianyu Lan
2021-11-23 17:44   ` Michael Kelley (LINUX)
2021-11-23 14:30 ` [PATCH V2 5/6] net: netvsc: Add Isolation VM support for netvsc driver Tianyu Lan
2021-11-23 17:55   ` Michael Kelley (LINUX)
2021-11-24 17:03   ` Michael Kelley (LINUX)
2021-11-25 21:58     ` Haiyang Zhang [this message]
2021-11-23 14:30 ` [PATCH V2 6/6] scsi: storvsc: Add Isolation VM support for storvsc driver Tianyu Lan
