All of lore.kernel.org
 help / color / mirror / Atom feed
From: Michael Kelley <mikelley@microsoft.com>
To: Christoph Hellwig <hch@lst.de>, Tianyu Lan <ltykernel@gmail.com>
Cc: KY Srinivasan <kys@microsoft.com>,
	Haiyang Zhang <haiyangz@microsoft.com>,
	Stephen Hemminger <sthemmin@microsoft.com>,
	"wei.liu@kernel.org" <wei.liu@kernel.org>,
	Dexuan Cui <decui@microsoft.com>,
	"catalin.marinas@arm.com" <catalin.marinas@arm.com>,
	"will@kernel.org" <will@kernel.org>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"bp@alien8.de" <bp@alien8.de>, "x86@kernel.org" <x86@kernel.org>,
	"hpa@zytor.com" <hpa@zytor.com>,
	"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
	"luto@kernel.org" <luto@kernel.org>,
	"peterz@infradead.org" <peterz@infradead.org>,
	"konrad.wilk@oracle.com" <konrad.wilk@oracle.com>,
	"boris.ostrovsky@oracle.com" <boris.ostrovsky@oracle.com>,
	"jgross@suse.com" <jgross@suse.com>,
	"sstabellini@kernel.org" <sstabellini@kernel.org>,
	"joro@8bytes.org" <joro@8bytes.org>,
	"davem@davemloft.net" <davem@davemloft.net>,
	"kuba@kernel.org" <kuba@kernel.org>,
	"jejb@linux.ibm.com" <jejb@linux.ibm.com>,
	"martin.petersen@oracle.com" <martin.petersen@oracle.com>,
	"gregkh@linuxfoundation.org" <gregkh@linuxfoundation.org>,
	"arnd@arndb.de" <arnd@arndb.de>,
	"m.szyprowski@samsung.com" <m.szyprowski@samsung.com>,
	"robin.murphy@arm.com" <robin.murphy@arm.com>,
	"brijesh.singh@amd.com" <brijesh.singh@amd.com>,
	"thomas.lendacky@amd.com" <thomas.lendacky@amd.com>,
	Tianyu Lan <Tianyu.Lan@microsoft.com>,
	"pgonda@google.com" <pgonda@google.com>,
	"martin.b.radev@gmail.com" <martin.b.radev@gmail.com>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"kirill.shutemov@linux.intel.com"
	<kirill.shutemov@linux.intel.com>,
	"rppt@kernel.org" <rppt@kernel.org>,
	"hannes@cmpxchg.org" <hannes@cmpxchg.org>,
	"aneesh.kumar@linux.ibm.com" <aneesh.kumar@linux.ibm.com>,
	"krish.sadhukhan@oracle.com" <krish.sadhukhan@oracle.com>,
	"saravanand@fb.com" <saravanand@fb.com>,
	"linux-arm-kernel@lists.infradead.org" 
	<linux-arm-kernel@lists.infradead.org>,
	"xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>,
	"rientjes@google.com" <rientjes@google.com>,
	"ardb@kernel.org" <ardb@kernel.org>,
	"iommu@lists.linux-foundation.org"
	<iommu@lists.linux-foundation.org>,
	"linux-arch@vger.kernel.org" <linux-arch@vger.kernel.org>,
	"linux-hyperv@vger.kernel.org" <linux-hyperv@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	vkuznets <vkuznets@redhat.com>,
	"parri.andrea@gmail.com" <parri.andrea@gmail.com>,
	"dave.hansen@intel.com" <dave.hansen@intel.com>
Subject: RE: [PATCH V4 00/13] x86/Hyper-V: Add Hyper-V Isolation VM support
Date: Tue, 31 Aug 2021 17:16:19 +0000	[thread overview]
Message-ID: <MWHPR21MB15933503E7C324167CB4132CD7CC9@MWHPR21MB1593.namprd21.prod.outlook.com> (raw)
In-Reply-To: <20210830120036.GA22005@lst.de>

From: Christoph Hellwig <hch@lst.de> Sent: Monday, August 30, 2021 5:01 AM
> 
> Sorry for the delayed answer, but I look at the vmap_pfn usage in the
> previous version and tried to come up with a better version.  This
> mostly untested branch:
> 
> http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/hyperv-vmap
> 
> get us there for swiotlb and the channel infrastructure  I've started
> looking at the network driver and didn't get anywhere due to other work.
> 
> As far as I can tell the network driver does gigantic multi-megabyte
> vmalloc allocation for the send and receive buffers, which are then
> passed to the hardware, but always copied to/from when interacting
> with the networking stack.  Did I see that right?  Are these big
> buffers actually required unlike the normal buffer management schemes
> in other Linux network drivers?
> 
> If so I suspect the best way to allocate them is by not using vmalloc
> but just discontiguous pages, and then use kmap_local_pfn where the
> PFN includes the share_gpa offset when actually copying from/to the
> skbs.

As a quick overview, I think there are four places where the
shared_gpa_boundary must be applied to adjust the guest physical
address that is used.  Each requires mapping a corresponding
virtual address range.  Here are the four places:

1)  The so-called "monitor pages" that are a core communication
mechanism between the guest and Hyper-V.  These are two single
pages, and the mapping is handled by calling memremap() for
each of the two pages.  See Patch 7 of Tianyu's series.

2)  The VMbus channel ring buffers.  You have proposed using
your new  vmap_phys_range() helper, but I don't think that works
here.  More details below.

3)  The network driver send and receive buffers.  vmap_phys_range()
should work here.

4) The swiotlb memory used for bounce buffers.  vmap_phys_range()
should work here as well.

Case #2 above does unusual mapping.  The ring buffer consists of a ring
buffer header page, followed by one or more pages that are the actual
ring buffer.  The pages making up the actual ring buffer are mapped
twice in succession.  For example, if the ring buffer has 4 pages
(one header page and three ring buffer pages), the contiguous
virtual mapping must cover these seven pages:  0, 1, 2, 3, 1, 2, 3.
The duplicate contiguous mapping allows the code that is reading
or writing the actual ring buffer to not be concerned about wrap-around
because writing off the end of the ring buffer is automatically
wrapped-around by the mapping.  The amount of data read or
written in one batch never exceeds the size of the ring buffer, and
after a batch is read or written, the read or write indices are adjusted
to put them back into the range of the first mapping of the actual
ring buffer pages.  So there's method to the madness, and the
technique works pretty well.  But this kind of mapping is not
amenable to using vmap_phys_range().

Michael



WARNING: multiple messages have this Message-ID (diff)
From: Michael Kelley via iommu <iommu@lists.linux-foundation.org>
To: Christoph Hellwig <hch@lst.de>, Tianyu Lan <ltykernel@gmail.com>
Cc: "parri.andrea@gmail.com" <parri.andrea@gmail.com>,
	"linux-hyperv@vger.kernel.org" <linux-hyperv@vger.kernel.org>,
	"brijesh.singh@amd.com" <brijesh.singh@amd.com>,
	"peterz@infradead.org" <peterz@infradead.org>,
	"catalin.marinas@arm.com" <catalin.marinas@arm.com>,
	"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
	"dave.hansen@intel.com" <dave.hansen@intel.com>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"hpa@zytor.com" <hpa@zytor.com>,
	KY Srinivasan <kys@microsoft.com>,
	"will@kernel.org" <will@kernel.org>,
	"boris.ostrovsky@oracle.com" <boris.ostrovsky@oracle.com>,
	"linux-arch@vger.kernel.org" <linux-arch@vger.kernel.org>,
	"wei.liu@kernel.org" <wei.liu@kernel.org>,
	"sstabellini@kernel.org" <sstabellini@kernel.org>,
	Stephen Hemminger <sthemmin@microsoft.com>,
	"xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>,
	"linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>,
	"aneesh.kumar@linux.ibm.com" <aneesh.kumar@linux.ibm.com>,
	"x86@kernel.org" <x86@kernel.org>,
	Dexuan Cui <decui@microsoft.com>,
	"ardb@kernel.org" <ardb@kernel.org>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"pgonda@google.com" <pgonda@google.com>,
	"rientjes@google.com" <rientjes@google.com>,
	"kuba@kernel.org" <kuba@kernel.org>,
	"jejb@linux.ibm.com" <jejb@linux.ibm.com>,
	"martin.b.radev@gmail.com" <martin.b.radev@gmail.com>,
	"thomas.lendacky@amd.com" <thomas.lendacky@amd.com>,
	Tianyu Lan <Tianyu.Lan@microsoft.com>,
	"arnd@arndb.de" <arnd@arndb.de>,
	"konrad.wilk@oracle.com" <konrad.wilk@oracle.com>,
	Haiyang Zhang <haiyangz@microsoft.com>,
	"bp@alien8.de" <bp@alien8.de>,
	"luto@kernel.org" <luto@kernel.org>,
	"krish.sadhukhan@oracle.com" <krish.sadhukhan@oracle.com>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	vkuznets <vkuznets@redhat.com>,
	"linux-arm-kernel@lists.infradead.org"
	<linux-arm-kernel@lists.infradead.org>,
	"jgross@suse.com" <jgross@suse.com>,
	"martin.petersen@oracle.com" <martin.petersen@oracle.com>,
	"saravanand@fb.com" <saravanand@fb.com>,
	"gregkh@linuxfoundation.org" <gregkh@linuxfoundation.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"iommu@lists.linux-foundation.org"
	<iommu@lists.linux-foundation.org>,
	"rppt@kernel.org" <rppt@kernel.org>,
	"hannes@cmpxchg.org" <hannes@cmpxchg.org>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"robin.murphy@arm.com" <robin.murphy@arm.com>,
	"davem@davemloft.net" <davem@davemloft.net>,
	"kirill.shutemov@linux.intel.com"
	<kirill.shutemov@linux.intel.com>
Subject: RE: [PATCH V4 00/13] x86/Hyper-V: Add Hyper-V Isolation VM support
Date: Tue, 31 Aug 2021 17:16:19 +0000	[thread overview]
Message-ID: <MWHPR21MB15933503E7C324167CB4132CD7CC9@MWHPR21MB1593.namprd21.prod.outlook.com> (raw)
In-Reply-To: <20210830120036.GA22005@lst.de>

From: Christoph Hellwig <hch@lst.de> Sent: Monday, August 30, 2021 5:01 AM
> 
> Sorry for the delayed answer, but I look at the vmap_pfn usage in the
> previous version and tried to come up with a better version.  This
> mostly untested branch:
> 
> http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/hyperv-vmap
> 
> get us there for swiotlb and the channel infrastructure  I've started
> looking at the network driver and didn't get anywhere due to other work.
> 
> As far as I can tell the network driver does gigantic multi-megabyte
> vmalloc allocation for the send and receive buffers, which are then
> passed to the hardware, but always copied to/from when interacting
> with the networking stack.  Did I see that right?  Are these big
> buffers actually required unlike the normal buffer management schemes
> in other Linux network drivers?
> 
> If so I suspect the best way to allocate them is by not using vmalloc
> but just discontiguous pages, and then use kmap_local_pfn where the
> PFN includes the share_gpa offset when actually copying from/to the
> skbs.

As a quick overview, I think there are four places where the
shared_gpa_boundary must be applied to adjust the guest physical
address that is used.  Each requires mapping a corresponding
virtual address range.  Here are the four places:

1)  The so-called "monitor pages" that are a core communication
mechanism between the guest and Hyper-V.  These are two single
pages, and the mapping is handled by calling memremap() for
each of the two pages.  See Patch 7 of Tianyu's series.

2)  The VMbus channel ring buffers.  You have proposed using
your new  vmap_phys_range() helper, but I don't think that works
here.  More details below.

3)  The network driver send and receive buffers.  vmap_phys_range()
should work here.

4) The swiotlb memory used for bounce buffers.  vmap_phys_range()
should work here as well.

Case #2 above does unusual mapping.  The ring buffer consists of a ring
buffer header page, followed by one or more pages that are the actual
ring buffer.  The pages making up the actual ring buffer are mapped
twice in succession.  For example, if the ring buffer has 4 pages
(one header page and three ring buffer pages), the contiguous
virtual mapping must cover these seven pages:  0, 1, 2, 3, 1, 2, 3.
The duplicate contiguous mapping allows the code that is reading
or writing the actual ring buffer to not be concerned about wrap-around
because writing off the end of the ring buffer is automatically
wrapped-around by the mapping.  The amount of data read or
written in one batch never exceeds the size of the ring buffer, and
after a batch is read or written, the read or write indices are adjusted
to put them back into the range of the first mapping of the actual
ring buffer pages.  So there's method to the madness, and the
technique works pretty well.  But this kind of mapping is not
amenable to using vmap_phys_range().

Michael


_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

  parent reply	other threads:[~2021-08-31 17:16 UTC|newest]

Thread overview: 102+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-27 17:20 [PATCH V4 00/13] x86/Hyper-V: Add Hyper-V Isolation VM support Tianyu Lan
2021-08-27 17:20 ` Tianyu Lan
2021-08-27 17:20 ` [PATCH V4 01/13] x86/hyperv: Initialize GHCB page in Isolation VM Tianyu Lan
2021-08-27 17:20   ` Tianyu Lan
2021-09-02  0:15   ` Michael Kelley
2021-09-02  0:15     ` Michael Kelley via iommu
2021-09-02  0:15     ` Michael Kelley
2021-08-27 17:21 ` [PATCH V4 02/13] x86/hyperv: Initialize shared memory boundary in the " Tianyu Lan
2021-08-27 17:21   ` Tianyu Lan
2021-09-02  0:15   ` Michael Kelley
2021-09-02  0:15     ` Michael Kelley
2021-09-02  0:15     ` Michael Kelley via iommu
2021-09-02  6:35     ` Tianyu Lan
2021-09-02  6:35       ` Tianyu Lan
2021-09-02  6:35       ` Tianyu Lan
2021-08-27 17:21 ` [PATCH V4 03/13] x86/hyperv: Add new hvcall guest address host visibility support Tianyu Lan
2021-08-27 17:21   ` Tianyu Lan
2021-09-02  0:16   ` Michael Kelley
2021-09-02  0:16     ` Michael Kelley
2021-09-02  0:16     ` Michael Kelley via iommu
2021-08-27 17:21 ` [PATCH V4 04/13] hyperv: Mark vmbus ring buffer visible to host in Isolation VM Tianyu Lan
2021-08-27 17:21   ` Tianyu Lan
2021-08-27 17:41   ` Greg KH
2021-08-27 17:41     ` Greg KH
2021-08-27 17:44     ` Tianyu Lan
2021-08-27 17:44       ` Tianyu Lan
2021-09-02  0:17   ` Michael Kelley
2021-09-02  0:17     ` Michael Kelley
2021-09-02  0:17     ` Michael Kelley via iommu
2021-08-27 17:21 ` [PATCH V4 05/13] hyperv: Add Write/Read MSR registers via ghcb page Tianyu Lan
2021-08-27 17:21   ` Tianyu Lan
2021-08-27 17:41   ` Greg KH
2021-08-27 17:41     ` Greg KH
2021-08-27 17:46     ` Tianyu Lan
2021-08-27 17:46       ` Tianyu Lan
2021-09-02  3:32   ` Michael Kelley
2021-09-02  3:32     ` Michael Kelley
2021-09-02  3:32     ` Michael Kelley via iommu
2021-08-27 17:21 ` [PATCH V4 06/13] hyperv: Add ghcb hvcall support for SNP VM Tianyu Lan
2021-08-27 17:21   ` Tianyu Lan
2021-09-02  0:20   ` Michael Kelley
2021-09-02  0:20     ` Michael Kelley
2021-09-02  0:20     ` Michael Kelley via iommu
2021-08-27 17:21 ` [PATCH V4 07/13] hyperv/Vmbus: Add SNP support for VMbus channel initiate message Tianyu Lan
2021-08-27 17:21   ` Tianyu Lan
2021-09-02  0:21   ` Michael Kelley
2021-09-02  0:21     ` Michael Kelley
2021-09-02  0:21     ` Michael Kelley via iommu
2021-08-27 17:21 ` [PATCH V4 08/13] hyperv/vmbus: Initialize VMbus ring buffer for Isolation VM Tianyu Lan
2021-08-27 17:21   ` Tianyu Lan
2021-09-02  0:23   ` Michael Kelley
2021-09-02  0:23     ` Michael Kelley
2021-09-02  0:23     ` Michael Kelley via iommu
2021-09-02 13:35     ` Tianyu Lan
2021-09-02 13:35       ` Tianyu Lan
2021-09-02 13:35       ` Tianyu Lan
2021-09-02 16:14       ` Michael Kelley
2021-09-02 16:14         ` Michael Kelley via iommu
2021-09-02 16:14         ` Michael Kelley
2021-08-27 17:21 ` [PATCH V4 09/13] DMA: Add dma_map_decrypted/dma_unmap_encrypted() function Tianyu Lan
2021-08-27 17:21   ` Tianyu Lan
2021-08-27 17:21 ` [PATCH V4 10/13] x86/Swiotlb: Add Swiotlb bounce buffer remap function for HV IVM Tianyu Lan
2021-08-27 17:21   ` Tianyu Lan
2021-08-27 17:21 ` [PATCH V4 11/13] hyperv/IOMMU: Enable swiotlb bounce buffer for Isolation VM Tianyu Lan
2021-08-27 17:21   ` Tianyu Lan
2021-09-02  1:27   ` Michael Kelley
2021-09-02  1:27     ` Michael Kelley
2021-09-02  1:27     ` Michael Kelley via iommu
2021-08-27 17:21 ` [PATCH V4 12/13] hv_netvsc: Add Isolation VM support for netvsc driver Tianyu Lan
2021-08-27 17:21   ` Tianyu Lan
2021-09-02  2:34   ` Michael Kelley
2021-09-02  2:34     ` Michael Kelley
2021-09-02  2:34     ` Michael Kelley via iommu
2021-09-02  4:56     ` Michael Kelley
2021-09-02  4:56       ` Michael Kelley
2021-09-02  4:56       ` Michael Kelley via iommu
2021-08-27 17:21 ` [PATCH V4 13/13] hv_storvsc: Add Isolation VM support for storvsc driver Tianyu Lan
2021-08-27 17:21   ` Tianyu Lan
2021-09-02  2:08   ` Michael Kelley
2021-09-02  2:08     ` Michael Kelley
2021-09-02  2:08     ` Michael Kelley via iommu
2021-08-30 12:00 ` [PATCH V4 00/13] x86/Hyper-V: Add Hyper-V Isolation VM support Christoph Hellwig
2021-08-30 12:00   ` Christoph Hellwig
2021-08-31 15:20   ` Tianyu Lan
2021-08-31 15:20     ` Tianyu Lan
2021-09-02  7:51     ` Christoph Hellwig
2021-09-02  7:51       ` Christoph Hellwig
2021-08-31 17:16   ` Michael Kelley [this message]
2021-08-31 17:16     ` Michael Kelley
2021-08-31 17:16     ` Michael Kelley via iommu
2021-09-02  7:59     ` Christoph Hellwig
2021-09-02  7:59       ` Christoph Hellwig
2021-09-02  7:59       ` Christoph Hellwig
2021-09-02 11:21       ` Tianyu Lan
2021-09-02 11:21         ` Tianyu Lan
2021-09-02 11:21         ` Tianyu Lan
2021-09-02 15:57       ` Michael Kelley
2021-09-02 15:57         ` Michael Kelley
2021-09-02 15:57         ` Michael Kelley via iommu
2021-09-14 14:41         ` Tianyu Lan
2021-09-14 14:41           ` Tianyu Lan
2021-09-14 14:41           ` Tianyu Lan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=MWHPR21MB15933503E7C324167CB4132CD7CC9@MWHPR21MB1593.namprd21.prod.outlook.com \
    --to=mikelley@microsoft.com \
    --cc=Tianyu.Lan@microsoft.com \
    --cc=akpm@linux-foundation.org \
    --cc=aneesh.kumar@linux.ibm.com \
    --cc=ardb@kernel.org \
    --cc=arnd@arndb.de \
    --cc=boris.ostrovsky@oracle.com \
    --cc=bp@alien8.de \
    --cc=brijesh.singh@amd.com \
    --cc=catalin.marinas@arm.com \
    --cc=dave.hansen@intel.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=davem@davemloft.net \
    --cc=decui@microsoft.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=haiyangz@microsoft.com \
    --cc=hannes@cmpxchg.org \
    --cc=hch@lst.de \
    --cc=hpa@zytor.com \
    --cc=iommu@lists.linux-foundation.org \
    --cc=jejb@linux.ibm.com \
    --cc=jgross@suse.com \
    --cc=joro@8bytes.org \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=konrad.wilk@oracle.com \
    --cc=krish.sadhukhan@oracle.com \
    --cc=kuba@kernel.org \
    --cc=kys@microsoft.com \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-hyperv@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-scsi@vger.kernel.org \
    --cc=ltykernel@gmail.com \
    --cc=luto@kernel.org \
    --cc=m.szyprowski@samsung.com \
    --cc=martin.b.radev@gmail.com \
    --cc=martin.petersen@oracle.com \
    --cc=mingo@redhat.com \
    --cc=netdev@vger.kernel.org \
    --cc=parri.andrea@gmail.com \
    --cc=peterz@infradead.org \
    --cc=pgonda@google.com \
    --cc=rientjes@google.com \
    --cc=robin.murphy@arm.com \
    --cc=rppt@kernel.org \
    --cc=saravanand@fb.com \
    --cc=sstabellini@kernel.org \
    --cc=sthemmin@microsoft.com \
    --cc=tglx@linutronix.de \
    --cc=thomas.lendacky@amd.com \
    --cc=vkuznets@redhat.com \
    --cc=wei.liu@kernel.org \
    --cc=will@kernel.org \
    --cc=x86@kernel.org \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.