linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Koenig, Christian" <Christian.Koenig@amd.com>
To: "Thomas Hellström (VMware)" <thomas_os@shipmail.org>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"pv-drivers@vmware.com" <pv-drivers@vmware.com>,
	Thomas Hellstrom <thellstrom@vmware.com>,
	"dri-devel@lists.freedesktop.org"
	<dri-devel@lists.freedesktop.org>
Subject: Re: TTM huge page-faults WAS: Re: [RFC PATCH 1/2] x86: Don't let pgprot_modify() change the page encryption bit
Date: Wed, 11 Sep 2019 14:06:03 +0000	[thread overview]
Message-ID: <ea486b68-7751-e51f-5a59-7e1f145820f4@amd.com> (raw)
In-Reply-To: <cace2653-447f-bcdc-2714-142d9dc85787@shipmail.org>

Am 11.09.19 um 12:10 schrieb Thomas Hellström (VMware):
[SNIP]
>>> The problem seen in TTM is that we want to be able to change the
>>> vm_page_prot from the fault handler, but it's problematic since we
>>> have the mmap_sem typically only in read mode. Hence the fake vma
>>> hack. From what I can tell it's reasonably well-behaved, since
>>> pte_modify() skips the bits TTM updates, so mprotect() and mremap()
>>> works OK. I think split_huge_pmd may run into trouble, but we don't
>>> support it (yet) with TTM.
>> Ah! I actually ran into this while implementing huge page support for
>> TTM and never figured out why that doesn't work. Dropped CPU huge page
>> support because of this.
>
> By incident, I got slightly sidetracked the other day and started 
> looking at this as well. Got to the point where I figured out all the 
> hairy alignment issues and actually got huge_fault() calls, but never 
> implemented the handler. I think that's definitely something worth 
> having. Not sure it will work for IO memory, though, (split_huge_pmd 
> will just skip non-page-backed memory) but if we only support 
> VM_SHARED (non COW) vmas there's no reason to split the huge pmds 
> anyway. Definitely something we should have IMO.

Well our primary use case would be IO memory, cause system memory is 
only optionally allocate as huge page but we nearly always allocate VRAM 
in chunks of at least 2MB because we otherwise get a huge performance 
penalty.

>>> We could probably get away with a WRITE_ONCE() update of the
>>> vm_page_prot before taking the page table lock since
>>>
>>> a) We're locking out all other writers.
>>> b) We can't race with another fault to the same vma since we hold an
>>> address space lock ("buffer object reservation")
>>> c) When we need to update there are no valid page table entries in the
>>> vma, since it only happens directly after mmap(), or after an
>>> unmap_mapping_range() with the same address space lock. When another
>>> reader (for example split_huge_pmd()) sees a valid page table entry,
>>> it also sees the new page protection and things are fine.
>> Yeah, that's exactly why I always wondered why we need this hack with
>> the vma copy on the stack.
>>
>>> But that would really be a special case. To solve this properly we'd
>>> probably need an additional lock to protect the vm_flags and
>>> vm_page_prot, taken after mmap_sem and i_mmap_lock.
>> Well we already have a special lock for this: The reservation object. So
>> memory barriers etc should be in place and I also think we can just
>> update the vm_page_prot on the fly.
>
> I agree. This is needed for huge pages. We should make this change, 
> and perhaps add the justification above as a comment.

Alternatively we could introduce a new VM_* flag telling users of 
vm_page_prot to just let the pages table entries be filled by faults again.

Christian.

  reply	other threads:[~2019-09-11 14:06 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-05 10:35 [RFC PATCH 0/2] Fix SEV user-space mapping of unencrypted coherent memory Thomas Hellström (VMware)
2019-09-05 10:35 ` [RFC PATCH 1/2] x86: Don't let pgprot_modify() change the page encryption bit Thomas Hellström (VMware)
2019-09-05 14:15   ` Dave Hansen
2019-09-05 15:21     ` Thomas Hellström (VMware)
2019-09-05 15:24       ` Christoph Hellwig
2019-09-05 16:40         ` Thomas Hellström (VMware)
2019-09-05 17:05         ` dma_mmap_fault discussion Thomas Hellström (VMware)
2019-09-06  6:32           ` Christoph Hellwig
2019-09-06  7:10             ` Thomas Hellström (VMware)
2019-09-06  7:20               ` Christoph Hellwig
2019-09-10  8:37                 ` Thomas Hellström (VMware)
2019-09-10 16:11         ` [RFC PATCH 1/2] x86: Don't let pgprot_modify() change the page encryption bit Andy Lutomirski
2019-09-10 19:26           ` Thomas Hellström (VMware)
2019-09-11  4:18             ` Andy Lutomirski
2019-09-11  7:49               ` Thomas Hellström (VMware)
2019-09-11 18:03                 ` Andy Lutomirski
2019-09-12  8:29                   ` Thomas Hellström (VMware)
2019-09-11  9:08             ` Koenig, Christian
2019-09-11 10:10               ` TTM huge page-faults WAS: " Thomas Hellström (VMware)
2019-09-11 14:06                 ` Koenig, Christian [this message]
2019-09-11 15:08                   ` Thomas Hellström (VMware)
2019-09-24 12:03                     ` Koenig, Christian
2019-09-05 15:59       ` Dave Hansen
2019-09-05 16:29         ` Thomas Hellström (VMware)
2019-09-05 10:35 ` [RFC PATCH 2/2] dma-mapping: Fix dma_pgprot() for unencrypted coherent pages Thomas Hellström (VMware)
2019-09-05 11:23 ` [RFC PATCH 0/2] Fix SEV user-space mapping of unencrypted coherent memory Christoph Hellwig
2019-09-10  6:11   ` Christoph Hellwig
2019-09-10  6:25     ` Thomas Hellström (VMware)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ea486b68-7751-e51f-5a59-7e1f145820f4@amd.com \
    --to=christian.koenig@amd.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pv-drivers@vmware.com \
    --cc=thellstrom@vmware.com \
    --cc=thomas_os@shipmail.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).