linux-hyperv.vger.kernel.org archive mirror
From: Dexuan Cui <decui@microsoft.com>
To: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: "ak@linux.intel.com" <ak@linux.intel.com>,
	"arnd@arndb.de" <arnd@arndb.de>, "bp@alien8.de" <bp@alien8.de>,
	"brijesh.singh@amd.com" <brijesh.singh@amd.com>,
	"Williams, Dan J" <dan.j.williams@intel.com>,
	"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
	Haiyang Zhang <haiyangz@microsoft.com>,
	"hpa@zytor.com" <hpa@zytor.com>,
	"jane.chu@oracle.com" <jane.chu@oracle.com>,
	"kirill.shutemov@linux.intel.com"
	<kirill.shutemov@linux.intel.com>,
	KY Srinivasan <kys@microsoft.com>,
	"linux-arch@vger.kernel.org" <linux-arch@vger.kernel.org>,
	"linux-hyperv@vger.kernel.org" <linux-hyperv@vger.kernel.org>,
	"luto@kernel.org" <luto@kernel.org>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"peterz@infradead.org" <peterz@infradead.org>,
	"rostedt@goodmis.org" <rostedt@goodmis.org>,
	"sathyanarayanan.kuppuswamy@linux.intel.com" 
	<sathyanarayanan.kuppuswamy@linux.intel.com>,
	"seanjc@google.com" <seanjc@google.com>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	"tony.luck@intel.com" <tony.luck@intel.com>,
	"wei.liu@kernel.org" <wei.liu@kernel.org>,
	"x86@kernel.org" <x86@kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: RE: [PATCH 3/6] x86/tdx: Support vmalloc() for tdx_enc_status_changed()
Date: Wed, 23 Nov 2022 23:51:11 +0000
Message-ID: <SA1PR21MB133536EA0C26DFE0168E2F98BF0C9@SA1PR21MB1335.namprd21.prod.outlook.com>
In-Reply-To: <20221122002421.qg4h47cjoc2birvb@box.shutemov.name>

> From: Kirill A. Shutemov <kirill@shutemov.name>
> Sent: Monday, November 21, 2022 4:24 PM
> 
> On Mon, Nov 21, 2022 at 11:51:48AM -0800, Dexuan Cui wrote:
> > When a TDX guest runs on Hyper-V, the hv_netvsc driver's netvsc_init_buf()
> > allocates buffers using vzalloc(), and needs to share the buffers with the
> > host OS by calling set_memory_decrypted(), which is not working for
> > vmalloc() yet. Add the support by handling the pages one by one.
> 
> Why do you use vmalloc here in the first place?

We changed to vmalloc() long ago, mainly for 2 reasons:

1) __alloc_pages() only allows us to allocate up to 4MB of contiguous pages, but
we need a 16MB buffer in the Hyper-V vNIC driver for better performance.

2) Due to memory fragmentation, we have seen the page allocator fail to
allocate even 2 contiguous pages, although the system has a lot of free memory.
We need to support Hyper-V vNIC hot addition, so we changed to vmalloc(). See

b679ef73edc2 ("hyperv: Add support for physically discontinuous receive buffer")
06b47aac4924 ("Drivers: net-next: hyperv: Increase the size of the sendbuf region")
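
For reference, the 4MB figure in 1) is just arithmetic on the largest
allocation order. Below is a minimal userspace sketch of it, assuming 4KB
base pages and orders 0..10 being allocatable (the usual x86 configuration
at the time of writing). The SKETCH_* constants and sketch_*() helpers are
made up for illustration and are not kernel APIs:

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumption: 4KB base pages */
#define SKETCH_MAX_ORDER 11	/* assumption: orders 0..10 can be allocated */

/* Smallest order such that (page size << order) covers 'size' bytes. */
static unsigned int sketch_get_order(size_t size)
{
	unsigned int order = 0;

	assert(size > 0);
	while ((SKETCH_PAGE_SIZE << order) < size)
		order++;
	return order;
}

/* Nonzero if one physically contiguous block of 'size' bytes is possible. */
static int sketch_fits_page_allocator(size_t size)
{
	return sketch_get_order(size) < SKETCH_MAX_ORDER;
}
```

Under these assumptions a 4MB request needs order 10 and fits, while a 16MB
request needs order 12 and does not, which is why the buffer can't come from
a single page-allocator call.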

> Will you also adjust direct mapping to have shared bit set?
> 
> If not, we will have problems with load_unaligned_zeropad() when it will
> access shared pages via non-shared mapping.
> 
> If direct mapping is adjusted, we will get direct mapping fragmentation.

load_unaligned_zeropad() was added 10 years ago by Linus in
e419b4cc5856 ("vfs: make word-at-a-time accesses handle a non-existing page") 
so this seemingly-strange usage is legitimate.

Sorry I don't know how to adjust direct mapping. Do you mean I should do
something like the below in tdx_enc_status_changed_for_vmalloc() for
every 'start_va':
  pa = slow_virt_to_phys(start_va);
  set_memory_decrypted(phys_to_virt(pa), 1);
?

But IIRC this issue is not specific to vmalloc()? e.g. we get 1 page by
__get_free_pages(GFP_KERNEL, 0) or kmalloc(PAGE_SIZE, GFP_KERNEL)
and we call set_memory_decrypted() for the page. How can we make
sure the callers of load_unaligned_zeropad() can't access the page
via non-shared mapping?
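
To spell out when this can happen: load_unaligned_zeropad() reads a whole
word from an arbitrary address, so a load that starts in the last few bytes
of one page also touches the next page. Here is a userspace sketch of that
boundary condition only, assuming 4KB pages and 8-byte words. The helper
name is hypothetical and it does not model the fault fixup that the real
function relies on:

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumption: 4KB pages */
#define SKETCH_WORD_SIZE 8UL	/* assumption: 64-bit word-at-a-time loads */

/*
 * Nonzero if a word-sized load starting at 'addr' spills into the
 * following page, i.e. its first and last byte are on different pages.
 */
static int sketch_word_load_crosses_page(uintptr_t addr)
{
	uintptr_t first_page = addr / SKETCH_PAGE_SIZE;
	uintptr_t last_page = (addr + SKETCH_WORD_SIZE - 1) / SKETCH_PAGE_SIZE;

	return first_page != last_page;
}
```

So if the following page has been converted to shared but is still mapped
as private in the direct mapping, any such load from the preceding page
reaches it through the non-shared mapping.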

It looks like you have a patchset that addresses the issue, though it doesn't
seem to have been merged into the mainline yet?
https://lwn.net/ml/linux-kernel/20220614120231.48165-11-kirill.shutemov@linux.intel.com/

BTW, I'll drop tdx_enc_status_changed_for_vmalloc() and try to enhance the
existing tdx_enc_status_changed() to support both the direct mapping and
vmalloc().

> Maybe tap into swiotlb buffer using DMA API?

I doubt the Hyper-V vNIC driver here can call dma_alloc_coherent() to
get a 16MB buffer from swiotlb buffers. I'm looking at dma_alloc_coherent() ->
dma_alloc_attrs() -> dma_direct_alloc(), which typically calls
__dma_direct_alloc_pages() to allocate contiguous memory pages, which
can't exceed the 4MB limit. (Note there is no virtual IOMMU in a guest on Hyper-V.)

It looks like we can't force dma_direct_alloc() to call dma_direct_alloc_no_mapping(),
because even if we set the DMA_ATTR_NO_KERNEL_MAPPING flag,
force_dma_unencrypted() is still always true for a TDX guest.
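
As a rough model of the condition I mean (a simplified userspace sketch based
on my reading of dma_direct_alloc(); the real function checks more than this,
and the SKETCH_* flag and sketch_*() names are stand-ins, not the kernel's
definitions):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in flag for this sketch only; not the kernel's actual value. */
#define SKETCH_ATTR_NO_KERNEL_MAPPING (1UL << 0)

struct sketch_dev {
	bool force_unencrypted;	/* models force_dma_unencrypted(dev) */
};

/*
 * True if the simplified model would take the "no kernel mapping" path:
 * the caller must ask for it AND the device must not require
 * unencrypted (shared) DMA memory.
 */
static bool sketch_takes_no_mapping_path(const struct sketch_dev *dev,
					 unsigned long attrs)
{
	assert(dev != NULL);
	return (attrs & SKETCH_ATTR_NO_KERNEL_MAPPING) &&
	       !dev->force_unencrypted;
}
```

With force_unencrypted set, as it would be for a TDX guest, the model never
takes the no-mapping path regardless of the attribute.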
