All of lore.kernel.org
 help / color / mirror / Atom feed
From: Stefano Stabellini <sstabellini@kernel.org>
To: Julien Grall <julien@xen.org>
Cc: Stefano Stabellini <sstabellini@kernel.org>,
	 Rahul Singh <Rahul.Singh@arm.com>,
	 "lucmiccio@gmail.com" <lucmiccio@gmail.com>,
	 xen-devel <xen-devel@lists.xenproject.org>,
	 Bertrand Marquis <Bertrand.Marquis@arm.com>,
	 Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>,
	 Stefano Stabellini <stefano.stabellini@xilinx.com>
Subject: Re: [PATCH v2] xen/arm: fix gnttab_need_iommu_mapping
Date: Fri, 12 Feb 2021 10:56:17 -0800 (PST)	[thread overview]
Message-ID: <alpine.DEB.2.21.2102121055220.3234@sstabellini-ThinkPad-T480s> (raw)
In-Reply-To: <170a971e-8e10-bb9d-a324-e09e40ed994c@xen.org>

[-- Attachment #1: Type: text/plain, Size: 10379 bytes --]

On Fri, 12 Feb 2021, Julien Grall wrote:
> On 11/02/2021 20:55, Stefano Stabellini wrote:
> > On Thu, 11 Feb 2021, Julien Grall wrote:
> > > On 11/02/2021 13:20, Rahul Singh wrote:
> > > > > On 10 Feb 2021, at 7:52 pm, Julien Grall <julien@xen.org> wrote:
> > > > > On 10/02/2021 18:08, Rahul Singh wrote:
> > > > > > Hello Julien,
> > > > > > > On 10 Feb 2021, at 5:34 pm, Julien Grall <julien@xen.org> wrote:
> > > > > > > On 10/02/2021 15:06, Rahul Singh wrote:
> > > > > > > > > On 9 Feb 2021, at 8:36 pm, Stefano Stabellini
> > > > > > > > > <sstabellini@kernel.org> wrote:
> > > > > > > > > 
> > > > > > > > > On Tue, 9 Feb 2021, Rahul Singh wrote:
> > > > > > > > > > > On 8 Feb 2021, at 6:49 pm, Stefano Stabellini
> > > > > > > > > > > <sstabellini@kernel.org> wrote:
> > > > > > > > > > > 
> > > > > > > > > > > Commit 91d4eca7add broke gnttab_need_iommu_mapping on ARM.
> > > > > > > > > > > The offending chunk is:
> > > > > > > > > > > 
> > > > > > > > > > > #define gnttab_need_iommu_mapping(d)                    \
> > > > > > > > > > > -    (is_domain_direct_mapped(d) && need_iommu(d))
> > > > > > > > > > > +    (is_domain_direct_mapped(d) && need_iommu_pt_sync(d))
> > > > > > > > > > > 
> > > > > > > > > > > On ARM we need gnttab_need_iommu_mapping to be true for
> > > > > > > > > > > dom0
> > > > > > > > > > > when it is
> > > > > > > > > > > directly mapped and IOMMU is enabled for the domain, like
> > > > > > > > > > > the
> > > > > > > > > > > old check
> > > > > > > > > > > did, but the new check is always false.
> > > > > > > > > > > 
> > > > > > > > > > > In fact, need_iommu_pt_sync is defined as
> > > > > > > > > > > dom_iommu(d)->need_sync and
> > > > > > > > > > > need_sync is set as:
> > > > > > > > > > > 
> > > > > > > > > > >     if ( !is_hardware_domain(d) || iommu_hwdom_strict )
> > > > > > > > > > >         hd->need_sync = !iommu_use_hap_pt(d);
> > > > > > > > > > > 
> > > > > > > > > > > iommu_use_hap_pt(d) means that the page-table used by the
> > > > > > > > > > > IOMMU is the
> > > > > > > > > > > P2M. It is true on ARM. need_sync means that you have a
> > > > > > > > > > > separate IOMMU
> > > > > > > > > > > page-table and it needs to be updated for every change.
> > > > > > > > > > > need_sync is set
> > > > > > > > > > > to false on ARM. Hence, gnttab_need_iommu_mapping(d) is
> > > > > > > > > > > false
> > > > > > > > > > > too,
> > > > > > > > > > > which is wrong.
> > > > > > > > > > > 
> > > > > > > > > > > As a consequence, when using PV network from a domU on a
> > > > > > > > > > > system where
> > > > > > > > > > > IOMMU is on from Dom0, I get:
> > > > > > > > > > > 
> > > > > > > > > > > (XEN) smmu: /smmu@fd800000: Unhandled context fault:
> > > > > > > > > > > fsr=0x402, iova=0x8424cb148, fsynr=0xb0001, cb=0
> > > > > > > > > > > [   68.290307] macb ff0e0000.ethernet eth0: DMA bus error:
> > > > > > > > > > > HRESP not OK
> > > > > > > > > > > 
> > > > > > > > > > > The fix is to go back to something along the lines of the
> > > > > > > > > > > old
> > > > > > > > > > > implementation of gnttab_need_iommu_mapping.
> > > > > > > > > > > 
> > > > > > > > > > > Signed-off-by: Stefano Stabellini
> > > > > > > > > > > <stefano.stabellini@xilinx.com>
> > > > > > > > > > > Fixes: 91d4eca7add
> > > > > > > > > > > Backport: 4.12+
> > > > > > > > > > > 
> > > > > > > > > > > ---
> > > > > > > > > > > 
> > > > > > > > > > > Given the severity of the bug, I would like to request
> > > > > > > > > > > this
> > > > > > > > > > > patch to be
> > > > > > > > > > > backported to 4.12 too, even if 4.12 is security-fixes
> > > > > > > > > > > only
> > > > > > > > > > > since Oct
> > > > > > > > > > > 2020.
> > > > > > > > > > > 
> > > > > > > > > > > For the 4.12 backport, we can use iommu_enabled() instead
> > > > > > > > > > > of
> > > > > > > > > > > is_iommu_enabled() in the implementation of
> > > > > > > > > > > gnttab_need_iommu_mapping.
> > > > > > > > > > > 
> > > > > > > > > > > Changes in v2:
> > > > > > > > > > > - improve commit message
> > > > > > > > > > > - add is_iommu_enabled(d) to the check
> > > > > > > > > > > ---
> > > > > > > > > > > xen/include/asm-arm/grant_table.h | 2 +-
> > > > > > > > > > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > > > > > > > > > > 
> > > > > > > > > > > diff --git a/xen/include/asm-arm/grant_table.h
> > > > > > > > > > > b/xen/include/asm-arm/grant_table.h
> > > > > > > > > > > index 6f585b1538..0ce77f9a1c 100644
> > > > > > > > > > > --- a/xen/include/asm-arm/grant_table.h
> > > > > > > > > > > +++ b/xen/include/asm-arm/grant_table.h
> > > > > > > > > > > @@ -89,7 +89,7 @@ int replace_grant_host_mapping(unsigned
> > > > > > > > > > > long
> > > > > > > > > > > gpaddr, mfn_t mfn,
> > > > > > > > > > >      (((i) >= nr_status_frames(t)) ? INVALID_GFN :
> > > > > > > > > > > (t)->arch.status_gfn[i])
> > > > > > > > > > > 
> > > > > > > > > > > #define gnttab_need_iommu_mapping(d)                    \
> > > > > > > > > > > -    (is_domain_direct_mapped(d) && need_iommu_pt_sync(d))
> > > > > > > > > > > +    (is_domain_direct_mapped(d) && is_iommu_enabled(d))
> > > > > > > > > > > 
> > > > > > > > > > > #endif /* __ASM_GRANT_TABLE_H__ */
> > > > > > > > > > 
> > > > > > > > > > I tested the patch and while creating the guest I observed
> > > > > > > > > > the
> > > > > > > > > > below warning from Linux for block device.
> > > > > > > > > > https://elixir.bootlin.com/linux/v4.3/source/drivers/block/xen-blkback/xenbus.c#L258
> > > > > > > > > 
> > > > > > > > > So you are creating a guest with "xl create" in dom0 and you
> > > > > > > > > see
> > > > > > > > > the
> > > > > > > > > warnings below printed by the Dom0 kernel? I imagine the domU
> > > > > > > > > has
> > > > > > > > > a
> > > > > > > > > virtual "disk" of some sort.
> > > > > > > > Yes you are right I am trying to create the guest with "xl
> > > > > > > > create”
> > > > > > > > and before that, I created the logical volume and trying to
> > > > > > > > attach
> > > > > > > > the logical volume
> > > > > > > > block to the domain with “xl block-attach”. I observed this
> > > > > > > > error
> > > > > > > > with the "xl block-attach” command.
> > > > > > > > This issue occurs after applying this patch as what I observed
> > > > > > > > this
> > > > > > > > patch introduce the calls to iommu_legacy_{, un}map() to map the
> > > > > > > > grant pages for
> > > > > > > > IOMMU that touches the page-tables. I am not sure but what I
> > > > > > > > observed is that something is written wrong when iomm_unmap
> > > > > > > > calls
> > > > > > > > unmap the pages
> > > > > > > > because of that issue is observed.
> > > > > > > 
> > > > > > > Can you clarify what you mean by "written wrong"? What sort of
> > > > > > > error
> > > > > > > do you see in the iommu_unmap()?
> > > > > > I might be wrong as per my understanding for ARM we are sharing the
> > > > > > P2M
> > > > > > between CPU and IOMMU always and the map_grant_ref() function is
> > > > > > written
> > > > > > in such a way that we have to call iommu_legacy_{, un}map() only if
> > > > > > P2M
> > > > > > is not shared.
> > > > > 
> > > > > map_grant_ref() will call the IOMMU if gnttab_need_iommu_mapping()
> > > > > returns
> > > > > true. I don't really see where this is assuming the P2M is not shared.
> > > > > 
> > > > > In fact, on x86, this will always be false for HVM domain (they
> > > > > support
> > > > > both shared and separate page-tables).
> > > > > 
> > > > > > As we are sharing the P2M when we call the iommu_map() function it
> > > > > > will
> > > > > > overwrite the existing GFN -> MFN ( For DOM0 GFN is same as MFN)
> > > > > > entry
> > > > > > and when we call iommu_unmap() it will unmap the  (GFN -> MFN )
> > > > > > entry
> > > > > > from the page-table.
> > > > > AFAIK, there should be nothing mapped at that GFN because the page
> > > > > belongs
> > > > > to the guest. At worse, we would overwrite a mapping that is the same.
> > > >   > Sorry I should have mention before backend/frontend is dom0 in this
> > > case and GFN is mapped. I am trying to attach the block device to DOM0
> > > 
> > > Ah, your log makes a lot more sense now. Thank you for the clarification!
> > > 
> > > So yes, I agree that iommu_{,un}map() will do the wrong thing if the
> > > frontend
> > > and backend in the same domain.
> > > 
> > > I don't know what the state in Linux, but from Xen PoV it should be
> > > possible
> > > to have the backend/frontend in the same domain.
> > > 
> > > I think we want to ignore the IOMMU mapping request when the domain is the
> > > same. Can you try this small untested patch:
> > 
> > Given that all the pages already owned by the domain should already be
> > in the shared pagetable between MMU and IOMMU, there is no need to
> > create a second mapping. In fact it is guaranteed to overlap with an
> > existing mapping.
> 
> It is **almost** guaranteed :). I can see a few reasons for this to not be
> valid:
>    - Using the domain shared info in a grant
>    - With a good timing, it would be possible that a differente vCPU remove
> the mapping after the P2M walk
> 
> That said, I feel it is not an expected behavior for a domain guest. So it is
> not something we should care at least for now.
> 
> > In theory, if guest_physmap_add_entry returned -EEXIST if a mapping
> > identical to the one we want to add is already in the pagetable, in this
> > instance we would see -EEXIST being returned.
> 
> While I agree that the GFN and MFN would be the same, there mapping still not
> be identical because the P2M type (and potentially the permission) will
> differ.
> 
> However, guest_physmap_add_entry() doesn't do such check today. It will just
> happily replace any mapping. It would be good to harden the code P2M as this
> is not the first time we see report of mapping overwritten.
> 
> I actually have a task in my todo list but I never got the chance to spend
> time on it.
> 
> > 
> > Based on that, I cannot think of unwanted side-effects for this patch.
> > It looks OK to me.
> > 
> > Given that it solves a different issue, I think it should be a separate
> > patch from [1]. Julien, are you OK with that or would you rather merge
> > the two?
> 
> They are two distinct issues. In fact, the bug has always been present on Arm.
> I will send a separate patch.

Excellent, thank you!

  reply	other threads:[~2021-02-12 18:56 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-02-08 18:49 [PATCH v2] xen/arm: fix gnttab_need_iommu_mapping Stefano Stabellini
2021-02-08 20:02 ` Julien Grall
2021-02-08 20:24   ` Stefano Stabellini
2021-02-08 21:23     ` Julien Grall
2021-02-09  1:57       ` Stefano Stabellini
2021-02-09  8:44         ` Julien Grall
2021-02-09 10:02     ` Jan Beulich
2021-02-09 14:03       ` Ian Jackson
2021-02-09 17:31         ` Stefano Stabellini
2021-02-09 21:04           ` Julien Grall
2021-02-10  1:35             ` Stefano Stabellini
2021-02-10 11:04           ` George Dunlap
2021-02-09 13:13 ` Rahul Singh
2021-02-09 20:36   ` Stefano Stabellini
2021-02-09 21:15     ` Stefano Stabellini
2021-02-10 15:06     ` Rahul Singh
2021-02-10 17:34       ` Julien Grall
2021-02-10 18:08         ` Rahul Singh
2021-02-10 19:52           ` Julien Grall
2021-02-11 13:20             ` Rahul Singh
2021-02-11 13:52               ` Julien Grall
2021-02-11 16:05                 ` Rahul Singh
2021-02-14 14:32                   ` Julien Grall
2021-02-15  9:12                     ` Rahul Singh
2021-02-15 11:34                       ` Julien Grall
2021-02-11 20:55                 ` Stefano Stabellini
2021-02-12 13:30                   ` Julien Grall
2021-02-12 18:56                     ` Stefano Stabellini [this message]
2021-02-10 21:13       ` Stefano Stabellini
2021-02-11 12:03         ` Rahul Singh

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.DEB.2.21.2102121055220.3234@sstabellini-ThinkPad-T480s \
    --to=sstabellini@kernel.org \
    --cc=Bertrand.Marquis@arm.com \
    --cc=Rahul.Singh@arm.com \
    --cc=Volodymyr_Babchuk@epam.com \
    --cc=julien@xen.org \
    --cc=lucmiccio@gmail.com \
    --cc=stefano.stabellini@xilinx.com \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.