From: "Liu, Yi L" <yi.l.liu@intel.com>
To: Jason Wang <jasowang@redhat.com>, 'Peter Xu' <peterx@redhat.com>
Cc: "Lan, Tianyu" <tianyu.lan@intel.com>,
	"Tian, Kevin" <kevin.tian@intel.com>,
	"'mst@redhat.com'" <mst@redhat.com>,
	"'jan.kiszka@siemens.com'" <jan.kiszka@siemens.com>,
	"'bd.aviv@gmail.com'" <bd.aviv@gmail.com>,
	"'qemu-devel@nongnu.org'" <qemu-devel@nongnu.org>,
	"'alex.williamson@redhat.com'" <alex.williamson@redhat.com>,
	'David Gibson' <david@gibson.dropbear.id.au>
Subject: Re: [Qemu-devel] [PATCH v7 14/17] memory: add MemoryRegionIOMMUOps.replay() callback
Date: Fri, 31 Mar 2017 07:30:22 +0000
Message-ID: <A2975661238FB949B60364EF0F2C257439036DF7@shsmsx102.ccr.corp.intel.com>
In-Reply-To: <81e79982-4af0-3b46-552c-eea4db05a362@redhat.com>

> -----Original Message-----
> From: Jason Wang [mailto:jasowang@redhat.com]
> Sent: Friday, March 31, 2017 3:17 PM
> To: Liu, Yi L <yi.l.liu@intel.com>; 'Peter Xu' <peterx@redhat.com>
> Cc: Lan, Tianyu <tianyu.lan@intel.com>; Tian, Kevin <kevin.tian@intel.com>;
> 'mst@redhat.com' <mst@redhat.com>; 'jan.kiszka@siemens.com'
> <jan.kiszka@siemens.com>; 'bd.aviv@gmail.com' <bd.aviv@gmail.com>;
> 'qemu-devel@nongnu.org' <qemu-devel@nongnu.org>; 'alex.williamson@redhat.com'
> <alex.williamson@redhat.com>; 'David Gibson' <david@gibson.dropbear.id.au>
> Subject: Re: [Qemu-devel] [PATCH v7 14/17] memory: add
> MemoryRegionIOMMUOps.replay() callback
> 
> 
> 
> On 2017-03-31 13:34, Liu, Yi L wrote:
> >> -----Original Message-----
> >> From: Jason Wang [mailto:jasowang@redhat.com]
> >> Sent: Thursday, March 30, 2017 7:58 PM
> >> To: Liu, Yi L <yi.l.liu@intel.com>; 'Peter Xu' <peterx@redhat.com>
> >> Cc: 'alex.williamson@redhat.com' <alex.williamson@redhat.com>; Lan,
> >> Tianyu <tianyu.lan@intel.com>; Tian, Kevin <kevin.tian@intel.com>;
> >> 'mst@redhat.com' <mst@redhat.com>; 'jan.kiszka@siemens.com'
> >> <jan.kiszka@siemens.com>; 'bd.aviv@gmail.com' <bd.aviv@gmail.com>;
> >> 'David Gibson' <david@gibson.dropbear.id.au>;
> >> 'qemu-devel@nongnu.org' <qemu-devel@nongnu.org>
> >> Subject: Re: [Qemu-devel] [PATCH v7 14/17] memory: add
> >> MemoryRegionIOMMUOps.replay() callback
> >>
> >>
> >>
> >> On 2017-03-30 19:06, Liu, Yi L wrote:
> >>>> -----Original Message-----
> >>>> From: Liu, Yi L
> >>>> Sent: Monday, March 27, 2017 5:22 PM
> >>>> To: Peter Xu <peterx@redhat.com>
> >>>> Cc: alex.williamson@redhat.com; Lan, Tianyu <tianyu.lan@intel.com>;
> >>>> Tian, Kevin <kevin.tian@intel.com>; mst@redhat.com;
> >>>> jan.kiszka@siemens.com; jasowang@redhat.com; bd.aviv@gmail.com;
> >>>> David Gibson <david@gibson.dropbear.id.au>; qemu-devel@nongnu.org
> >>>> Subject: RE: [Qemu-devel] [PATCH v7 14/17] memory: add
> >>>> MemoryRegionIOMMUOps.replay() callback
> >>>>
> >>>>> -----Original Message-----
> >>>>> From: Peter Xu [mailto:peterx@redhat.com]
> >>>>> Sent: Monday, March 27, 2017 5:12 PM
> >>>>> To: Liu, Yi L <yi.l.liu@intel.com>
> >>>>> Cc: alex.williamson@redhat.com; Lan, Tianyu
> >>>>> <tianyu.lan@intel.com>; Tian, Kevin <kevin.tian@intel.com>;
> >>>>> mst@redhat.com; jan.kiszka@siemens.com; jasowang@redhat.com;
> >>>>> bd.aviv@gmail.com; David Gibson <david@gibson.dropbear.id.au>;
> >>>>> qemu-devel@nongnu.org
> >>>>> Subject: Re: [Qemu-devel] [PATCH v7 14/17] memory: add
> >>>>> MemoryRegionIOMMUOps.replay() callback
> >>>>>
> >>>>> On Mon, Mar 27, 2017 at 08:35:05AM +0000, Liu, Yi L wrote:
> >>>>>>> -----Original Message-----
> >>>>>>> From: Qemu-devel
> >>>>>>> [mailto:qemu-devel-bounces+yi.l.liu=intel.com@nongnu.org] On
> >>>>>>> Behalf Of Peter Xu
> >>>>>>> Sent: Tuesday, February 7, 2017 4:28 PM
> >>>>>>> To: qemu-devel@nongnu.org
> >>>>>>> Cc: Lan, Tianyu <tianyu.lan@intel.com>; Tian, Kevin
> >>>>>>> <kevin.tian@intel.com>; mst@redhat.com; jan.kiszka@siemens.com;
> >>>>>>> jasowang@redhat.com; peterx@redhat.com;
> >>>>>>> alex.williamson@redhat.com; bd.aviv@gmail.com; David Gibson
> >>>>>>> <david@gibson.dropbear.id.au>
> >>>>>>> Subject: [Qemu-devel] [PATCH v7 14/17] memory: add
> >>>>>>> MemoryRegionIOMMUOps.replay() callback
> >>>>>>>
> >>>>>>> Originally we have one memory_region_iommu_replay() function,
> >>>>>>> which is the default behavior to replay the translations of the
> >>>>>>> whole IOMMU region. However, on some platform like x86, we may
> >>>>>>> want our own replay logic for IOMMU regions.
> >>>>>>> This patch add one more hook for IOMMUOps for the callback, and
> >>>>>>> it'll override the default if set.
> >>>>>>>
> >>>>>>> Signed-off-by: Peter Xu <peterx@redhat.com>
> >>>>>>> ---
> >>>>>>>    include/exec/memory.h | 2 ++
> >>>>>>>    memory.c              | 6 ++++++
> >>>>>>>    2 files changed, 8 insertions(+)
> >>>>>>>
> >>>>>>> diff --git a/include/exec/memory.h b/include/exec/memory.h
> >>>>>>> index 0767888..30b2a74 100644
> >>>>>>> --- a/include/exec/memory.h
> >>>>>>> +++ b/include/exec/memory.h
> >>>>>>> @@ -191,6 +191,8 @@ struct MemoryRegionIOMMUOps {
> >>>>>>>        void (*notify_flag_changed)(MemoryRegion *iommu,
> >>>>>>>                                    IOMMUNotifierFlag old_flags,
> >>>>>>>                                    IOMMUNotifierFlag new_flags);
> >>>>>>> +    /* Set this up to provide customized IOMMU replay function */
> >>>>>>> +    void (*replay)(MemoryRegion *iommu, IOMMUNotifier *notifier);
> >>>>>>>    };
> >>>>>>>
> >>>>>>>    typedef struct CoalescedMemoryRange CoalescedMemoryRange;
> >>>>>>> diff --git a/memory.c b/memory.c
> >>>>>>> index 7a4f2f9..9c253cc 100644
> >>>>>>> --- a/memory.c
> >>>>>>> +++ b/memory.c
> >>>>>>> @@ -1630,6 +1630,12 @@ void memory_region_iommu_replay(MemoryRegion *mr, IOMMUNotifier *n,
> >>>>>>>        hwaddr addr, granularity;
> >>>>>>>        IOMMUTLBEntry iotlb;
> >>>>>>> +    /* If the IOMMU has its own replay callback, override */
> >>>>>>> +    if (mr->iommu_ops->replay) {
> >>>>>>> +        mr->iommu_ops->replay(mr, n);
> >>>>>>> +        return;
> >>>>>>> +    }
> >>>>>> Hi Alex, Peter,
> >>>>>>
> >>>>>> Will all the other vendors(e.g. PPC, s390, ARM) add their own
> >>>>>> replay callback as well? I guess it depends on whether the
> >>>>>> original replay algorithm work well for them? Do you have such knowledge?
> >>>>> I guess so. At least for VT-d we had this callback since the
> >>>>> default replay mechanism did not work well on x86 due to its
> >>>>> extremely large memory region size. Thanks,
> >>>> thx. that would make sense.
> >>> Peter,
> >>>
> >>> Just come to mind that there may be a corner case here.
> >>>
> >>> Intel VT-d actually has a "pt" mode which allows device use physical
> >>> address even when VT-d is enabled. In kernel, there is a
> >>> iommu_identity_mapping.
> >>> If a device is in this map, then it would use "pt" mode. So that
> >>> IOMMU driver would not build second-level page table for it.
> >> Yes, but qemu does not support ECAP_PT now, so guest will still have
> >> a page table in this case.
> > That's true. Without ECAP_PT, IOMMU driver would create a 1:1 map. So
> > this solution can work well even a device is in identify_map.
> >
> >>> Back to the virtual IOVA implementation, if an assigned device is in
> >>> the iommu_identity_mapping(e.g. VGA controller), it uses GPA directly to do
> >>> DMA.
> >>> So it demands a GPA->HPA mapping in host. However, the
> >>> iommu->ops.replay is not able to build it when guest SL page table is empty.
> >>>
> >>> So I think building an entire guest PA->HPA mapping before guest
> >>> kernel boot would be recommended. Any thoughts?
> >> We plan to add PT in 2.10, a possible rough idea is disabled iommu
> >> dmar region and use another region without iommu_ops. Then
> >> vfio_listener_region_add() will just do the correct mappings.
> > Good to know it. Actually, I also need to expose ECAP_PT for vSVM. So
> > just comes to realize that the current replay solution may not work well when I
> > expose ECAP_PT to guest.
> > I also have a rough idea here. The current listener in container
> > listens to address space named with devfn if virtual VTd is added. How
> > about adding one more listener to listen memory address space. So that the
> > listener can build entire guest PA->HPA mapping.
> 
> This is only needed for PT. So looks like current code is sufficient to do this I think.
> See the else part of if (memory_region_is_iommu()) of vfio_listener_region_add().

Jason, when the listener listens to the device address space, the "else part" may not
work even if we set mr->iommu_ops = NULL. The mr would be a non-RAM region at the time
region_add is called, since the listener is actually listening to changes from the
device address space.

Regards,
Yi L
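
The concern in the reply above can be illustrated with a minimal sketch. It
assumes the region_add path behaves roughly as discussed in this thread: IOMMU
regions get a notifier plus replay, RAM regions get mapped directly (the "else
part"), and anything else is skipped. The types and names below (Region,
ListenerAction, listener_region_add) are hypothetical stand-ins for
illustration only; they are not the real QEMU MemoryRegion or
vfio_listener_region_add() definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for a memory region.
 * These fields only model the two properties the thread talks about. */
typedef struct Region {
    bool has_iommu_ops;   /* models mr->iommu_ops != NULL */
    bool is_ram;          /* models "mr is a RAM region" */
} Region;

/* What the listener ends up doing for a region it is told about. */
typedef enum {
    ACTION_SKIP,              /* nothing is mapped for this region */
    ACTION_REGISTER_NOTIFIER, /* IOMMU path: register notifier, then replay */
    ACTION_MAP_RAM            /* "else part": map the RAM region directly */
} ListenerAction;

/* Assumed decision logic of a region_add callback (illustrative only). */
ListenerAction listener_region_add(const Region *mr)
{
    assert(mr != NULL);

    if (mr->has_iommu_ops) {
        return ACTION_REGISTER_NOTIFIER;
    }
    if (mr->is_ram) {
        return ACTION_MAP_RAM;
    }
    /* Neither an IOMMU region nor RAM: there is nothing to map. */
    return ACTION_SKIP;
}
```

Under these assumptions, clearing iommu_ops on the region seen in the device
address space gives { has_iommu_ops = false, is_ram = false }, which lands in
ACTION_SKIP rather than ACTION_MAP_RAM. That is why the "else part" alone
would not build the guest PA->HPA mapping in that case.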

