Re: [PATCH V5] xen/gnttab: Store frame GFN in struct page_info on Arm

From: Oleksandr <olekstysh@gmail.com>
To: Julien Grall <julien@xen.org>
Cc: "Oleksandr Tyshchenko" <Oleksandr_Tyshchenko@epam.com>,
	"Jan Beulich" <jbeulich@suse.com>,
	"xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>,
	"Stefano Stabellini" <sstabellini@kernel.org>,
	"Volodymyr Babchuk" <Volodymyr_Babchuk@epam.com>,
	"Bertrand Marquis" <bertrand.marquis@arm.com>,
	"Andrew Cooper" <andrew.cooper3@citrix.com>,
	"George Dunlap" <george.dunlap@citrix.com>,
	"Wei Liu" <wl@xen.org>, "Roger Pau Monné" <roger.pau@citrix.com>,
	"Bob Eshleman" <bobbyeshleman@gmail.com>,
	"Alistair Francis" <alistair.francis@wdc.com>,
	"Connor Davis" <connojdavis@gmail.com>
Subject: Re: [PATCH V5] xen/gnttab: Store frame GFN in struct page_info on Arm
Date: Thu, 10 Feb 2022 18:55:14 +0200	[thread overview]
Message-ID: <f9e3ee5b-0811-8317-42c2-61c8cdbfe254@gmail.com> (raw)
In-Reply-To: <a104d3ea-170e-8175-ac04-abfcebb4ae29@xen.org>


On 10.02.22 11:46, Julien Grall wrote:
>
>
> On 08/02/2022 19:50, Oleksandr Tyshchenko wrote:
>>
>> On 08.02.22 13:58, Julien Grall wrote:
>>> Hi,
>>
>> Hi Julien
>
> Hi,


Hi Julien


Thank you for the clarifications!


>
>>>
>>>
>>> (Jan please confirm) If I am not mistaken, on x86, a read to the M2P
>>> is not always protected. But they have code within the P2M lock to
>>> check any difference (see p2m_remove_page()). I think we would need
>>> the same, so we don't end up to introduce a behavior similar to what
>>> XSA-387 has fixed on x86.
>>
>>
>> ... OK, I assume you are speaking about the check in the loop that was
>> added by the following commit:
>> c65ea16dbcafbe4fe21693b18f8c2a3c5d14600e "x86/p2m: don't assert that the
>> passed in MFN matches for a remove"
>
> Yes, this is the one I Have in mind.


good


>
>> Also, I assume we need that check in the same place on Arm (with P2M
>> lock held), which, I think, could be p2m_remove_mapping().
>
> I believe so. 


good


> Can you do some testing to check this would not break other types of 
> mapping? (Booting a guest and using PV device should be enough).


Sure, already checked and will check more thoroughly before submitting.


>
>
>>
>> I ported the check from x86 code, but this is not a verbatim copy due to
>> the difference in local P2M helpers/macro between arches, also I have
>> skipped a part of that check "|| t == p2m_mmio_direct" which was added
>> by one of the follow-up commit:
>> 753cb68e653002e89fdcd1c80e52905fdbfb78cb "x86/p2m: guard (in particular)
>> identity mapping entries"
>> since I have no idea whether we need the same on Arm.
>
> I am not entirely sure. For now, I would drop it so long the behavior 
> stay the same (i.e. it will go ahead with removing the mappings).t.


ok, will drop.


>
>
>> Below the diff I have locally:
>>
>> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
>> index 5646343..90d7563 100644
>> --- a/xen/arch/arm/p2m.c
>> +++ b/xen/arch/arm/p2m.c
>> @@ -1315,11 +1315,32 @@ static inline int p2m_remove_mapping(struct
>> domain *d,
>>                                         mfn_t mfn)
>>    {
>>        struct p2m_domain *p2m = p2m_get_hostp2m(d);
>> +    unsigned long i;
>>        int rc;
>>
>>        p2m_write_lock(p2m);
>> +    for ( i = 0; i < nr; )
>> +    {
>> +        unsigned int cur_order;
>> +        bool valid;
>> +        mfn_t mfn_return = p2m_get_entry(p2m, gfn_add(start_gfn, i),
>> NULL, NULL,
>> +                                         &cur_order, &valid); > +
>> +        if ( valid &&
>
> valid is a copy of the LPAE bit valid. This may be 0 if Xen decided to 
> clear it (i.e when emulating set/way). Yet the mapping itself is 
> considered valid from Xen PoV.
>
> So you want to replace with a different check (see below).


Hmm, I got it, so ...


>
>
>> +             (!mfn_valid(mfn) || !mfn_eq(mfn_add(mfn, i), 
>> mfn_return)) )
>> +        {
>> +            rc = -EILSEQ;
>> +            goto out;
>> +        }
>> +
>> +        i += (1UL << cur_order) -
>> +             ((gfn_x(start_gfn) + i) & ((1UL << cur_order) - 1));
>> +    }
>> +
>>        rc = p2m_set_entry(p2m, start_gfn, nr, INVALID_MFN,
>>                           p2m_invalid, p2m_access_rwx);
>> +
>> +out:
>>        p2m_write_unlock(p2m);
>>
>>        return rc;
>>
>>
>> Could you please clarify, is it close to what you had in mind? If yes, I
>> am wondering, don't we need this check to be only executed for xenheap
>> pages (and, probably, which P2M's entry type in RAM?) rather than for
>> all pages?
>
> From my understanding, for the purpose of this work, we only strictly 
> need to check that for xenheap pages.


  ... yes, but ...


>
>
> But I think it would be a good opportunity to harden the P2M code. At 
> the moment, on Arm, you can remove any mappings you want (even with 
> the wrong helpers). This lead us to a few issues when mapping were 
> overriden silently (in particular when building dom0).
> So I would say we should enforce it for every RAM mapping. 


... I think this makes sense, so the proper check in that case, I 
assume, should contain p2m_is_any_ram() macro:

diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 5646343..2b82c49 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -1315,11 +1315,32 @@ static inline int p2m_remove_mapping(struct 
domain *d,
                                       mfn_t mfn)
  {
      struct p2m_domain *p2m = p2m_get_hostp2m(d);
+    unsigned long i;
      int rc;

      p2m_write_lock(p2m);
+    for ( i = 0; i < nr; )
+    {
+        unsigned int cur_order;
+        p2m_type_t t;
+        mfn_t mfn_return = p2m_get_entry(p2m, gfn_add(start_gfn, i), 
&t, NULL,
+                                         &cur_order, NULL);
+
+        if ( p2m_is_any_ram(t) &&
+             (!mfn_valid(mfn) || !mfn_eq(mfn_add(mfn, i), mfn_return)) )
+        {
+            rc = -EILSEQ;
+            goto out;
+        }
+
+        i += (1UL << cur_order) -
+             ((gfn_x(start_gfn) + i) & ((1UL << cur_order) - 1));
+    }
+
      rc = p2m_set_entry(p2m, start_gfn, nr, INVALID_MFN,
                         p2m_invalid, p2m_access_rwx);
+
+out:
      p2m_write_unlock(p2m);

      return rc;
(END)


> Stefano, Bertrand, what do you think?
>
> Note that, I would like to see this change in a separate commit. It 
> will be easier to review.


ok, I will introduce this check by a separate patch.


>
>
>>
>>
>>>
>>>
>>> In addition to that, if p2m_get_xenheap_gfn() is going to be called
>>> locklessly. Then we need to make sure the update to type_info are
>>> atomic. This means:
>>>   - p2m_get_xenheap_gfn() should use READ_ONCE().
>>>   - p2m_set_xenheap_gfn() should use WRITE_ONCE(). We might even need
>>> to use cmpxchg() if there are other update to type_info that are not
>>> protected. I will let you have a look.
>>
>>
>> ... OK, I didn't find READ_ONCE/WRITE_ONCE in Xen. I am wondering, can
>> we use ACCESS_ONCE instead?
>
> Yes. Sorry, I keep forgetting we don't have READ_ONCE/WRITE_ONCE in Xen.


ok


>
>>
>> Below the diff I have locally:
>>
>> diff --git a/xen/arch/arm/include/asm/mm.h 
>> b/xen/arch/arm/include/asm/mm.h
>> index 9e093a6..b18acb7 100644
>> --- a/xen/arch/arm/include/asm/mm.h
>> +++ b/xen/arch/arm/include/asm/mm.h
>> @@ -373,7 +373,7 @@ unsigned int arch_get_dma_bitsize(void);
>>
>>    static inline gfn_t page_get_xenheap_gfn(const struct page_info *p)
>>    {
>> -    gfn_t gfn_ = _gfn(p->u.inuse.type_info & PGT_gfn_mask);
>> +    gfn_t gfn_ = _gfn(ACCESS_ONCE(p->u.inuse.type_info) & 
>> PGT_gfn_mask);
>>
>>        ASSERT(is_xen_heap_page(p));
>>
>> @@ -383,11 +383,14 @@ static inline gfn_t page_get_xenheap_gfn(const
>> struct page_info *p)
>>    static inline void page_set_xenheap_gfn(struct page_info *p, gfn_t 
>> gfn)
>>    {
>>        gfn_t gfn_ = gfn_eq(gfn, INVALID_GFN) ? 
>> PGT_INVALID_XENHEAP_GFN : gfn;
>> +    unsigned long type_info;
>>
>>        ASSERT(is_xen_heap_page(p));
>>
>> -    p->u.inuse.type_info &= ~PGT_gfn_mask;
>> -    p->u.inuse.type_info |= gfn_x(gfn_);
>> +    type_info = ACCESS_ONCE(p->u.inuse.type_info);
>> +    type_info &= ~PGT_gfn_mask;
>> +    type_info |= gfn_x(gfn_);
>> +    ACCESS_ONCE(p->u.inuse.type_info) = type_info;
>>    }
>>
>>    #endif /*  __ARCH_ARM_MM__ */
>>
>>
>> It is going to be a non-protected write to GFN portion of type_info.
>
> Well no. You are using a Read-Modify-Write operation on type_info. 
> This is not atomic and will overwrite any change (if any) done on 
> other part of the type_info.


I am confused a bit, to which my comment your comment above belongs (to 
the diff for page_set_xenheap_gfn() above or to sentence right after it)?
The "It is going to be a non-protected write to GFN portion of 
type_info." sentence is related to the diff for alloc_heap_pages() 
below. Sorry if I didn't separate the comments properly.


>
>
> If I am mistaken, there are two other places where type_info is 
> modified. One is...
>
>
>> But, at that time the page is not used yet, so I think this is harmless.
>>
>> diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
>> index 50334a0..97cf0d8 100644
>> --- a/xen/common/page_alloc.c
>> +++ b/xen/common/page_alloc.c
>> @@ -1024,7 +1024,7 @@ static struct page_info *alloc_heap_pages(
>>                                     &tlbflush_timestamp);
>>
>>             /* Initialise fields which have other uses for free 
>> pages. */
>> -        pg[i].u.inuse.type_info = 0;
>> +        pg[i].u.inuse.type_info = PGT_TYPE_INFO_INITIALIZER;
>>             page_set_owner(&pg[i], NULL);
>>
>>         }
>
> ... this one. I agree the page is not accessible at this time. So 
> page_set_xenheap_gfn() should not be used.


yes


>
>
> The other one is in share_xen_page_with_guest() which I think is still 
> fine because the caller page_set_xenheap_gfn() would need to acquire a 
> reference on the page. This is only possible after the count_info is 
> updated in share_xen_page_with_guest() *and* there a barrier between 
> the type_info and count_info.
>
> I think this behavior should be documented on top of type_info (along 
> with the locking). This would be helpful if type_info gain more use in 
> the future.


agree, will do.


>
>
> Cheers,
>
-- 
Regards,

Oleksandr Tyshchenko