linux-kernel.vger.kernel.org archive mirror
From: "Kirill A. Shutemov" <kirill@shutemov.name>
To: Dexuan Cui <decui@microsoft.com>
Cc: "ak@linux.intel.com" <ak@linux.intel.com>,
	"arnd@arndb.de" <arnd@arndb.de>, "bp@alien8.de" <bp@alien8.de>,
	"brijesh.singh@amd.com" <brijesh.singh@amd.com>,
	"Williams, Dan J" <dan.j.williams@intel.com>,
	"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
	Haiyang Zhang <haiyangz@microsoft.com>,
	"hpa@zytor.com" <hpa@zytor.com>,
	"jane.chu@oracle.com" <jane.chu@oracle.com>,
	"kirill.shutemov@linux.intel.com"
	<kirill.shutemov@linux.intel.com>,
	KY Srinivasan <kys@microsoft.com>,
	"linux-arch@vger.kernel.org" <linux-arch@vger.kernel.org>,
	"linux-hyperv@vger.kernel.org" <linux-hyperv@vger.kernel.org>,
	"luto@kernel.org" <luto@kernel.org>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"peterz@infradead.org" <peterz@infradead.org>,
	"rostedt@goodmis.org" <rostedt@goodmis.org>,
	"sathyanarayanan.kuppuswamy@linux.intel.com" 
	<sathyanarayanan.kuppuswamy@linux.intel.com>,
	"seanjc@google.com" <seanjc@google.com>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	"tony.luck@intel.com" <tony.luck@intel.com>,
	"wei.liu@kernel.org" <wei.liu@kernel.org>,
	"x86@kernel.org" <x86@kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 3/6] x86/tdx: Support vmalloc() for tdx_enc_status_changed()
Date: Thu, 24 Nov 2022 10:51:25 +0300
Message-ID: <20221124075125.56cpbkmjyr26dzsn@box.shutemov.name>
In-Reply-To: <SA1PR21MB133536EA0C26DFE0168E2F98BF0C9@SA1PR21MB1335.namprd21.prod.outlook.com>

On Wed, Nov 23, 2022 at 11:51:11PM +0000, Dexuan Cui wrote:
> > From: Kirill A. Shutemov <kirill@shutemov.name>
> > Sent: Monday, November 21, 2022 4:24 PM
> > 
> > On Mon, Nov 21, 2022 at 11:51:48AM -0800, Dexuan Cui wrote:
> > > When a TDX guest runs on Hyper-V, the hv_netvsc driver's netvsc_init_buf()
> > > allocates buffers using vzalloc(), and needs to share the buffers with the
> > > host OS by calling set_memory_decrypted(), which is not working for
> > > vmalloc() yet. Add the support by handling the pages one by one.
> > 
> > Why do you use vmalloc here in the first place?
> 
> We changed to vmalloc() long ago, mainly for 2 reasons:
> 
> 1) __alloc_pages() only allows us to allocate up to 4MB of contiguous pages, but
> we need a 16MB buffer in the Hyper-V vNIC driver for better performance.
> 
> 2) Due to memory fragmentation, we have seen that the page allocator can fail
> to allocate 2 contiguous pages, though the system has a lot of free memory. We
> need to support Hyper-V vNIC hot addition, so we changed to vmalloc. See
> 
> b679ef73edc2 ("hyperv: Add support for physically discontinuous receive buffer")
> 06b47aac4924 ("Drivers: net-next: hyperv: Increase the size of the sendbuf region")
> 
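
The size mismatch in reason 1) can be checked with a little arithmetic. The sketch below is only an illustration and assumes 4 KiB pages and a maximum page-allocator order of 10, which matches the 4MB limit quoted above but is not stated in this thread. The toy_* helpers are hypothetical names, not kernel functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: largest physically contiguous allocation, in bytes,
 * when the page allocator can hand out at most 2^max_order pages at once. */
size_t toy_max_contig_bytes(size_t page_size, unsigned int max_order)
{
	assert(max_order < 32);
	return page_size << max_order;
}

/* Hypothetical helper: number of pages needed to back a buffer of
 * the given size, rounding up to whole pages. */
size_t toy_pages_needed(size_t bytes, size_t page_size)
{
	assert(page_size != 0);
	return (bytes + page_size - 1) / page_size;
}
```

Under those assumptions, toy_max_contig_bytes(4096, 10) is 4 MiB, while a 16 MiB buffer needs toy_pages_needed(16 MiB, 4096) = 4096 pages, four times more than a single contiguous allocation can cover.
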
> > Will you also adjust direct mapping to have shared bit set?
> > 
> > If not, we will have problems with load_unaligned_zeropad() when it will
> > access shared pages via non-shared mapping.
> > 
> > If direct mapping is adjusted, we will get direct mapping fragmentation.
> 
> load_unaligned_zeropad() was added 10 years ago by Linus in
> e419b4cc5856 ("vfs: make word-at-a-time accesses handle a non-existing page") 
> so this seemingly-strange usage is legitimate.
> 
> Sorry I don't know how to adjust direct mapping. Do you mean I should do
> something like the below in tdx_enc_status_changed_for_vmalloc() for
> every 'start_va':
>   pa = slow_virt_to_phys(start_va);
>   set_memory_decrypted(phys_to_virt(pa), 1);
> ?
> 
> But IIRC this issue is not specific to vmalloc()? e.g. we get 1 page by
> __get_free_pages(GFP_KERNEL, 0) or kmalloc(PAGE_SIZE, GFP_KERNEL)
> and we call set_memory_decrypted() for the page. How can we make
> sure the callers of load_unaligned_zeropad() can't access the page
> via non-shared mapping?

__get_free_pages() and kmalloc() return a pointer to the page in the direct
mapping. set_memory_decrypted() adjusts the direct mapping to have the shared
bit set, so there is no non-shared alias left. Everything is fine.
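
The difference between the two cases can be pictured with a toy userspace model. This is only a sketch for illustration, not kernel code: every toy_* name is hypothetical, and it assumes each physical page always has a direct-mapping entry and may additionally have a vmalloc-style alias with its own shared bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model; none of these names are kernel APIs. */
enum { TOY_NR_PAGES = 8 };

struct toy_pte {
	bool   present; /* entry maps something */
	size_t pfn;     /* physical page it maps */
	bool   shared;  /* models the TDX shared bit */
};

/* Direct mapping: entry i always maps physical page i. */
static struct toy_pte toy_direct[TOY_NR_PAGES];
/* vmalloc-style area: entries map arbitrary physical pages. */
static struct toy_pte toy_vmalloc[TOY_NR_PAGES];

void toy_init(void)
{
	for (size_t i = 0; i < TOY_NR_PAGES; i++) {
		toy_direct[i] = (struct toy_pte){ true, i, false };
		toy_vmalloc[i] = (struct toy_pte){ false, 0, false };
	}
}

/* Install a second (alias) mapping of pfn at vmalloc slot idx. */
void toy_vmap(size_t idx, size_t pfn)
{
	assert(idx < TOY_NR_PAGES && pfn < TOY_NR_PAGES);
	toy_vmalloc[idx] = (struct toy_pte){ true, pfn, false };
}

/* Models set_memory_decrypted() on a direct-mapping address. */
void toy_share_direct(size_t pfn)
{
	assert(pfn < TOY_NR_PAGES);
	toy_direct[pfn].shared = true;
}

/* Models converting only the vmalloc alias, page by page. */
void toy_share_vmalloc_only(size_t idx)
{
	assert(idx < TOY_NR_PAGES && toy_vmalloc[idx].present);
	toy_vmalloc[idx].shared = true;
}

/* True if every mapping of pfn agrees on the shared bit. */
bool toy_aliases_consistent(size_t pfn)
{
	assert(pfn < TOY_NR_PAGES);
	for (size_t i = 0; i < TOY_NR_PAGES; i++) {
		if (toy_vmalloc[i].present && toy_vmalloc[i].pfn == pfn &&
		    toy_vmalloc[i].shared != toy_direct[pfn].shared)
			return false;
	}
	return true;
}
```

In this model, a page that only has its direct-mapping entry stays consistent after toy_share_direct(). A page that also has a vmalloc alias becomes inconsistent if only toy_share_vmalloc_only() is called, which is the load_unaligned_zeropad() concern: the direct mapping still reaches the page through a non-shared entry.
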

> It looks like you have a patchset to address the issue (it looks like it
> hasn't been merged into the mainline?) ?
> https://lwn.net/ml/linux-kernel/20220614120231.48165-11-kirill.shutemov@linux.intel.com/

It addresses a similar but different issue. It is only relevant for
unaccepted memory support.

> BTW, I'll drop tdx_enc_status_changed_for_vmalloc() and try to enhance the
> existing tdx_enc_status() to support both direct mapping and vmalloc().
> 
> > Maybe tap into swiotlb buffer using DMA API?
> 
> I doubt the Hyper-V vNIC driver here can call dma_alloc_coherent() to
> get a 16MB buffer from swiotlb buffers. I'm looking at dma_alloc_coherent() ->
> dma_alloc_attrs() -> dma_direct_alloc(), which typically calls 
> __dma_direct_alloc_pages() to allocate contiguous memory pages (which
> can't exceed the 4MB limit. Note there is no virtual IOMMU in a guest on Hyper-V).
> 
> It looks like we can't force dma_direct_alloc() to call dma_direct_alloc_no_mapping(),
> because even if we set the DMA_ATTR_NO_KERNEL_MAPPING flag,
> force_dma_unencrypted() is still always true for a TDX guest.

The point is not to reach dma_direct_alloc_no_mapping(). The idea is to
allocate from the existing swiotlb, which already has the shared bit set in
the direct mapping.

A vmap area that maps pages allocated from swiotlb should also work fine.

To be honest, I don't understand the DMA API well enough. I need to experiment
with it to see what works for this case.

-- 
  Kiryl Shutsemau / Kirill A. Shutemov

