All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ard Biesheuvel <ard.biesheuvel@linaro.org>
To: Christoffer Dall <christoffer.dall@linaro.org>
Cc: KVM devel mailing list <kvm@vger.kernel.org>,
	"kvmarm@lists.cs.columbia.edu" <kvmarm@lists.cs.columbia.edu>,
	Marc Zyngier <marc.zyngier@arm.com>,
	p.fedin@samsung.com
Subject: Re: [PATCH v2] ARM/arm64: KVM: test properly for a PTE's uncachedness
Date: Tue, 10 Nov 2015 14:48:01 +0100	[thread overview]
Message-ID: <CAKv+Gu9kyASJfLhO8GS3fDxSNaWSWPwRxHJ6MNv5EPjTZVD7OQ@mail.gmail.com> (raw)
In-Reply-To: <20151110134048.GE12968@cbox>

On 10 November 2015 at 14:40, Christoffer Dall
<christoffer.dall@linaro.org> wrote:
> On Tue, Nov 10, 2015 at 02:15:45PM +0100, Ard Biesheuvel wrote:
>> On 10 November 2015 at 13:22, Christoffer Dall
>> <christoffer.dall@linaro.org> wrote:
>> > On Tue, Nov 10, 2015 at 10:45:37AM +0100, Ard Biesheuvel wrote:
>> >> Hi all,
>> >>
>> >> I wonder if this is a better way to address the problem. It looks at
>> >> the nature of the memory rather than the nature of the mapping, which
>> >> is probably a more reliable indicator of whether cache maintenance is
>> >> required when performing the unmap.
>> >>
>> >>
>> >> -----------8<----------------
>> >> The open coded tests for checking whether a PTE maps a page as
>> >> uncached use a flawed 'pte_val(xxx) & CONST != CONST' pattern,
>> >> which is not guaranteed to work since the type of a mapping is
>> >> not a set of mutually exclusive bits
>> >>
>> >> For HYP mappings, the type is an index into the MAIR table (i.e, the
>> >> index itself does not contain any information whatsoever about the
>> >> type of the mapping), and for stage-2 mappings it is a bit field where
>> >> normal memory and device types are defined as follows:
>> >>
>> >>     #define MT_S2_NORMAL            0xf
>> >>     #define MT_S2_DEVICE_nGnRE      0x1
>> >>
>> >> I.e., masking *and* comparing with the latter matches on the former,
>> >> and we have been getting lucky merely because the S2 device mappings
>> >> also have the PTE_UXN bit set, or we would misidentify memory mappings
>> >> as device mappings.
>> >>
>> >> Since the unmap_range() code path (which contains one instance of the
>> >> flawed test) is used both for HYP mappings and stage-2 mappings, and
>> >> considering the difference between the two, it is non-trivial to fix
>> >> this by rewriting the tests in place, as it would involve passing
>> >> down the type of mapping through all the functions.
>> >>
>> >> However, since HYP mappings and stage-2 mappings both deal with host
>> >> physical addresses, we can simply check whether the mapping is backed
>> >> by memory that is managed by the host kernel, and only perform the
>> >> D-cache maintenance if this is the case.
>> >>
>> >> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
>> >> ---
>> >>  arch/arm/kvm/mmu.c | 15 +++++++--------
>> >>  1 file changed, 7 insertions(+), 8 deletions(-)
>> >>
>> >> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
>> >> index 6984342da13d..7dace909d5cf 100644
>> >> --- a/arch/arm/kvm/mmu.c
>> >> +++ b/arch/arm/kvm/mmu.c
>> >> @@ -98,6 +98,11 @@ static void kvm_flush_dcache_pud(pud_t pud)
>> >>       __kvm_flush_dcache_pud(pud);
>> >>  }
>> >>
>> >> +static bool kvm_is_device_pfn(unsigned long pfn)
>> >> +{
>> >> +     return !pfn_valid(pfn);
>> >> +}
>> >> +
>> >>  /**
>> >>   * stage2_dissolve_pmd() - clear and flush huge PMD entry
>> >>   * @kvm:     pointer to kvm structure.
>> >> @@ -213,7 +218,7 @@ static void unmap_ptes(struct kvm *kvm, pmd_t *pmd,
>> >>                       kvm_tlb_flush_vmid_ipa(kvm, addr);
>> >>
>> >>                       /* No need to invalidate the cache for device mappings */
>> >> -                     if ((pte_val(old_pte) & PAGE_S2_DEVICE) != PAGE_S2_DEVICE)
>> >> +                     if (!kvm_is_device_pfn(__phys_to_pfn(addr)))
>> >>                               kvm_flush_dcache_pte(old_pte);
>> >>
>> >>                       put_page(virt_to_page(pte));
>> >> @@ -305,8 +310,7 @@ static void stage2_flush_ptes(struct kvm *kvm, pmd_t *pmd,
>> >>
>> >>       pte = pte_offset_kernel(pmd, addr);
>> >>       do {
>> >> -             if (!pte_none(*pte) &&
>> >> -                 (pte_val(*pte) & PAGE_S2_DEVICE) != PAGE_S2_DEVICE)
>> >> +             if (!pte_none(*pte) && !kvm_is_device_pfn(__phys_to_pfn(addr)))
>> >>                       kvm_flush_dcache_pte(*pte);
>> >>       } while (pte++, addr += PAGE_SIZE, addr != end);
>> >>  }
>> >> @@ -1037,11 +1041,6 @@ static bool kvm_is_write_fault(struct kvm_vcpu *vcpu)
>> >>       return kvm_vcpu_dabt_iswrite(vcpu);
>> >>  }
>> >>
>> >> -static bool kvm_is_device_pfn(unsigned long pfn)
>> >> -{
>> >> -     return !pfn_valid(pfn);
>> >> -}
>> >> -
>> >>  /**
>> >>   * stage2_wp_ptes - write protect PMD range
>> >>   * @pmd:     pointer to pmd entry
>> >> --
>> >> 1.9.1
>> >>
>> >
>> > So PAGE_HYP_DEVICE is used only to map the vgic-v2 regions and
>> > PAGE_S2_DEVICE is used to map the vgic VCPU interface and for all memory
>> > regions where the vma has (vm_flags & VM_PFNMAP).
>> >
>> > Will these, and only these, cases be covered by the pfn_valid check?
>> >
>>
>> The pfn_valid() check will ensure that cache maintenance is only
>> performed on regions that are known to the host as memory, are managed
>> by the host (i.e., there is a struct page associated with them) and
>> will be accessed by the host via cacheable mappings (they are covered
>> by the linear mapping, or [on ARM] will be kmap'ed cacheable if they
>> are highmem). If you look at the commit that introduced these tests
>> (363ef89f8e9b arm/arm64: KVM: Invalidate data cache on unmap), the
>> concern it addresses is that the guest may perform uncached accesses
>> to regions that the host has mapped cacheable, meaning guest writes
>> may be shadowed by clean cachelines, making them invisble to cache
>> coherent I/O. So afaict, pfn_valid() is an appropriate check here.
>
> right, this I agree with.
>
>>
>> pfn_valid() will not misidentify device regions as memory (unless the
>> host is really broken) so this should fix Pavel's case. The converse
>> case (a memory region misidentified as a device) could only happen for
>> a carve-out (i.e., via the /reserved-memory node) that is mapped
>> inside the guest via a pass-through (VM_PFNMAP) mapping. That case is
>> already dodgy, since the guest accesses would be forced
>> uncached/ordered due to the fact that those mappings are forced
>> PAGE_S2_DEVICE at stage 2 (as you mention), and would also be
>> misidentified by the current code (due to the PAGE_S2_DEVICE
>> attributes)
>>
>
> ok, but such pages should never be swapped out by the host, so I think
> we're still ok here.
>

Yes, I think so, and the patch does not change how that case is handled anyway.

> Will you send an updated proper patch to the list?  You can add my
> RB if you want.
>

Thanks. I will resend with your R-b and Pavel's Tested-by added.

-- 
Ard.

      reply	other threads:[~2015-11-10 13:48 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-11-06 11:43 [PATCH] ARM/arm64: KVM: test properly for a PTE's uncachedness Ard Biesheuvel
2015-11-06 11:43 ` Ard Biesheuvel
2015-11-09  7:24 ` Pavel Fedin
2015-11-09  7:24   ` Pavel Fedin
2015-11-09  8:17 ` Marc Zyngier
2015-11-09  8:17   ` Marc Zyngier
2015-11-09 16:21 ` Christoffer Dall
2015-11-09 16:21   ` Christoffer Dall
2015-11-09 16:27   ` Ard Biesheuvel
2015-11-09 16:27     ` Ard Biesheuvel
2015-11-09 16:35     ` Christoffer Dall
2015-11-09 16:35       ` Christoffer Dall
2015-11-09 16:59       ` Ard Biesheuvel
2015-11-09 16:59         ` Ard Biesheuvel
     [not found]         ` <1447148737-15363-1-git-send-email-ard.biesheuvel@linaro.org>
2015-11-10  9:47           ` [PATCH v2] " Ard Biesheuvel
2015-11-10 10:27             ` Pavel Fedin
     [not found]           ` <20151110122203.GD12968@cbox>
2015-11-10 13:15             ` Ard Biesheuvel
2015-11-10 13:40               ` Christoffer Dall
2015-11-10 13:48                 ` Ard Biesheuvel [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAKv+Gu9kyASJfLhO8GS3fDxSNaWSWPwRxHJ6MNv5EPjTZVD7OQ@mail.gmail.com \
    --to=ard.biesheuvel@linaro.org \
    --cc=christoffer.dall@linaro.org \
    --cc=kvm@vger.kernel.org \
    --cc=kvmarm@lists.cs.columbia.edu \
    --cc=marc.zyngier@arm.com \
    --cc=p.fedin@samsung.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.