From: David Gibson <david@gibson.dropbear.id.au>
To: Peter Xu <peterx@redhat.com>
Cc: qemu-devel@nongnu.org, mst@redhat.com, jasowang@redhat.com,
	vkaplans@redhat.com, alex.williamson@redhat.com, wexu@redhat.com,
	pbonzini@redhat.com, cornelia.huck@de.ibm.com,
	dgibson@redhat.com
Subject: Re: [Qemu-devel] [PATCH v3 1/3] memory: introduce IOMMUNotifier and its caps
Date: Wed, 7 Sep 2016 16:02:39 +1000
Message-ID: <20160907060239.GP2780@voom.fritz.box>
In-Reply-To: <1473226344-28520-2-git-send-email-peterx@redhat.com>

On Wed, Sep 07, 2016 at 01:32:22PM +0800, Peter Xu wrote:
> IOMMU Notifier list is used for notifying IO address mapping changes.
> Currently VFIO is the only user.
> 
> However it is possible that future consumer like vhost would like to
> only listen to part of its notifications (e.g., cache invalidations).
> 
> This patch introduced IOMMUNotifier and IOMMUNotfierCap bits for a finer
> grained control of it.
> 
> IOMMUNotifier contains a bitfield for the notify consumer describing
> what kind of notification it is interested in. Currently two kinds of
> notifications are defined:
> 
> - IOMMU_NOTIFIER_CHANGE:       for entry changes (additions)
> - IOMMU_NOTIFIER_INVALIDATION: for entry removals (cache invalidates)

As noted on the other thread, I think the correct options for your
bitmap here are "map" and "unmap".  Which of them is triggered depends
on the permissions / existence of the *previous* mapping, as well as
the new one.

You could in fact have "map-read", "map-write", "unmap-read",
"unmap-write" as separate bitmap options (e.g. changing a mapping from
RO to WO would be both a read-unmap and write-map event).  I can't see
any real use for that though, so just "map" and "unmap" are probably
sufficient.
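
To make that concrete, here is a rough sketch of how the event set
could be derived from the old and new state of an entry.  All the
names (PERM_*, EV_*, events_for_change) are made up for illustration
and are not in the patch; the PERM_* values just mirror the usual
none/RO/WO/RW encoding.  Treating a changed target address as "unmap
the old, map the new" is my assumption about what an in-place change
should mean.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the access permissions of an entry. */
enum { PERM_NONE = 0, PERM_RO = 1, PERM_WO = 2, PERM_RW = 3 };

/* Hypothetical event bits a notifier could ask for. */
enum { EV_UNMAP = 0x1, EV_MAP = 0x2 };

/*
 * Work out which events a change from old_perm to new_perm produces.
 * target_changed says whether the translated address differs, in which
 * case every old access goes away and every new access is fresh.
 */
static unsigned events_for_change(unsigned old_perm, unsigned new_perm,
                                  bool target_changed)
{
    unsigned removed = old_perm & ~new_perm;  /* access that is lost */
    unsigned added = new_perm & ~old_perm;    /* access that is gained */
    unsigned events = 0;

    if (target_changed) {
        removed = old_perm;
        added = new_perm;
    }
    if (removed) {
        events |= EV_UNMAP;
    }
    if (added) {
        events |= EV_MAP;
    }
    return events;
}
```

With this, RO -> WO yields both EV_UNMAP and EV_MAP, while a fresh
mapping yields only EV_MAP and a removal only EV_UNMAP.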

> When registering the IOMMU notifier, we need to specify one or multiple
> capability bit(s) to listen to.
> 
> When notifications are triggered, it will be checked against the
> notifier's capability bits, and only notifiers with registered bits will
> be notified.
> 
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  hw/vfio/common.c              |  3 ++-
>  include/exec/memory.h         | 39 ++++++++++++++++++++++++++++++++-------
>  include/hw/vfio/vfio-common.h |  2 +-
>  memory.c                      | 37 ++++++++++++++++++++++++++++---------
>  4 files changed, 63 insertions(+), 18 deletions(-)
> 
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index b313e7c..b0cea2c 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -293,7 +293,7 @@ static bool vfio_listener_skipped_section(MemoryRegionSection *section)
>             section->offset_within_address_space & (1ULL << 63);
>  }
>  
> -static void vfio_iommu_map_notify(Notifier *n, void *data)
> +static void vfio_iommu_map_notify(IOMMUNotifier *n, void *data)
>  {
>      VFIOGuestIOMMU *giommu = container_of(n, VFIOGuestIOMMU, n);
>      VFIOContainer *container = giommu->container;
> @@ -454,6 +454,7 @@ static void vfio_listener_region_add(MemoryListener *listener,
>                                 section->offset_within_region;
>          giommu->container = container;
>          giommu->n.notify = vfio_iommu_map_notify;
> +        giommu->n.notifier_caps = IOMMU_NOTIFIER_ALL;

"caps" isn't really right.  It's a *requirement* that VFIO get all the
notifications, not a capability.  "caps" would only make sense on the
other side (the vIOMMU implementation).

>          QLIST_INSERT_HEAD(&container->giommu_list, giommu, giommu_next);
>  
>          memory_region_register_iommu_notifier(giommu->iommu, &giommu->n);
> diff --git a/include/exec/memory.h b/include/exec/memory.h
> index 3e4d416..92f14db 100644
> --- a/include/exec/memory.h
> +++ b/include/exec/memory.h
> @@ -67,6 +67,28 @@ struct IOMMUTLBEntry {
>      IOMMUAccessFlags perm;
>  };
>  
> +/*
> + * Bitmap for differnet IOMMUNotifier capabilities. Each notifier can
> + * register with one or multiple IOMMU Notifier capability bit(s).
> + */
> +typedef enum {
> +    IOMMU_NOTIFIER_NONE = 0,
> +    /* Notify cache invalidations */
> +    IOMMU_NOTIFIER_INVALIDATION = 0x1,
> +    /* Notify entry changes (newly created entries) */
> +    IOMMU_NOTIFIER_CHANGE = 0x2,
> +} IOMMUNotifierCap;
> +
> +#define IOMMU_NOTIFIER_ALL (IOMMU_NOTIFIER_INVALIDATION | \
> +                            IOMMU_NOTIFIER_CHANGE)
> +
> +struct IOMMUNotifier {
> +    void (*notify)(struct IOMMUNotifier *notifier, void *data);
> +    IOMMUNotifierCap notifier_caps;
> +    QLIST_ENTRY(IOMMUNotifier) node;
> +};
> +typedef struct IOMMUNotifier IOMMUNotifier;
> +
>  /* New-style MMIO accessors can indicate that the transaction failed.
>   * A zero (MEMTX_OK) response means success; anything else is a failure
>   * of some kind. The memory subsystem will bitwise-OR together results
> @@ -201,7 +223,7 @@ struct MemoryRegion {
>      const char *name;
>      unsigned ioeventfd_nb;
>      MemoryRegionIoeventfd *ioeventfds;
> -    NotifierList iommu_notify;
> +    QLIST_HEAD(, IOMMUNotifier) iommu_notify;
>  };
>  
>  /**
> @@ -620,11 +642,12 @@ void memory_region_notify_iommu(MemoryRegion *mr,
>   * IOMMU translation entries.
>   *
>   * @mr: the memory region to observe
> - * @n: the notifier to be added; the notifier receives a pointer to an
> - *     #IOMMUTLBEntry as the opaque value; the pointer ceases to be
> - *     valid on exit from the notifier.
> + * @n: the IOMMUNotifier to be added; the notify callback receives a
> + *     pointer to an #IOMMUTLBEntry as the opaque value; the pointer
> + *     ceases to be valid on exit from the notifier.
>   */
> -void memory_region_register_iommu_notifier(MemoryRegion *mr, Notifier *n);
> +void memory_region_register_iommu_notifier(MemoryRegion *mr,
> +                                           IOMMUNotifier *n);

It seems to me that this should be allowed to fail, if the notifier
you're trying to register requires notifications that the MR
implementation can't supply.  That seems cleaner than delaying the
checking until the notification actually happens.

>  /**
>   * memory_region_iommu_replay: replay existing IOMMU translations to
> @@ -636,7 +659,8 @@ void memory_region_register_iommu_notifier(MemoryRegion *mr, Notifier *n);
>   * @is_write: Whether to treat the replay as a translate "write"
>   *     through the iommu
>   */
> -void memory_region_iommu_replay(MemoryRegion *mr, Notifier *n, bool is_write);
> +void memory_region_iommu_replay(MemoryRegion *mr, IOMMUNotifier *n,
> +                                bool is_write);
>  
>  /**
>   * memory_region_unregister_iommu_notifier: unregister a notifier for
> @@ -646,7 +670,8 @@ void memory_region_iommu_replay(MemoryRegion *mr, Notifier *n, bool is_write);
>   *      needs to be called
>   * @n: the notifier to be removed.
>   */
> -void memory_region_unregister_iommu_notifier(MemoryRegion *mr, Notifier *n);
> +void memory_region_unregister_iommu_notifier(MemoryRegion *mr,
> +                                             IOMMUNotifier *n);
>  
>  /**
>   * memory_region_name: get a memory region's name
> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
> index 94dfae3..c17602e 100644
> --- a/include/hw/vfio/vfio-common.h
> +++ b/include/hw/vfio/vfio-common.h
> @@ -93,7 +93,7 @@ typedef struct VFIOGuestIOMMU {
>      VFIOContainer *container;
>      MemoryRegion *iommu;
>      hwaddr iommu_offset;
> -    Notifier n;
> +    IOMMUNotifier n;
>      QLIST_ENTRY(VFIOGuestIOMMU) giommu_next;
>  } VFIOGuestIOMMU;
>  
> diff --git a/memory.c b/memory.c
> index 0eb6895..45a3902 100644
> --- a/memory.c
> +++ b/memory.c
> @@ -1418,7 +1418,7 @@ void memory_region_init_iommu(MemoryRegion *mr,
>      memory_region_init(mr, owner, name, size);
>      mr->iommu_ops = ops,
>      mr->terminates = true;  /* then re-forwards */
> -    notifier_list_init(&mr->iommu_notify);
> +    QLIST_INIT(&mr->iommu_notify);
>  }
>  
>  static void memory_region_finalize(Object *obj)
> @@ -1513,13 +1513,16 @@ bool memory_region_is_logging(MemoryRegion *mr, uint8_t client)
>      return memory_region_get_dirty_log_mask(mr) & (1 << client);
>  }
>  
> -void memory_region_register_iommu_notifier(MemoryRegion *mr, Notifier *n)
> +void memory_region_register_iommu_notifier(MemoryRegion *mr,
> +                                           IOMMUNotifier *n)
>  {
> +    /* We need to register for at least one bitfield */
> +    assert(n->notifier_caps != IOMMU_NOTIFIER_NONE);

Not sure if it makes sense to implement NOTIFIER_NONE as a no-op just
for orthogonality.

>      if (mr->iommu_ops->notify_started &&
> -        QLIST_EMPTY(&mr->iommu_notify.notifiers)) {
> +        QLIST_EMPTY(&mr->iommu_notify)) {
>          mr->iommu_ops->notify_started(mr);

As noted above, I think register_notify should get the ability to
fail, which would happen if notify_started() failed (obviously it
needs to get a failure mode as well).  Basically notify_started is
required to check that this vIOMMU is able to supply the notifications
that have been requested.
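
Something along these lines is what I have in mind.  This is an
untested sketch with invented names (SketchRegion, SketchOps,
sketch_register and so on), not the real memory API.  It also assumes
the hook is called for every registration rather than only the first
one, since each notifier can carry different requirements.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical event bits a notifier can require. */
enum { EV_UNMAP = 0x1, EV_MAP = 0x2 };

typedef struct SketchRegion SketchRegion;

typedef struct SketchNotifier {
    unsigned required;              /* events this consumer must get */
    struct SketchNotifier *next;
} SketchNotifier;

typedef struct SketchOps {
    /* Returns 0 if the vIOMMU can deliver 'required', else -errno. */
    int (*notify_started)(SketchRegion *mr, unsigned required);
} SketchOps;

struct SketchRegion {
    const SketchOps *ops;
    SketchNotifier *notifiers;      /* singly linked list head */
};

/* Example hook for a vIOMMU that can only report unmaps. */
static int unmap_only_notify_started(SketchRegion *mr, unsigned required)
{
    (void)mr;
    return (required & ~(unsigned)EV_UNMAP) ? -ENOTSUP : 0;
}

/* Registration fails up front instead of silently dropping events. */
static int sketch_register(SketchRegion *mr, SketchNotifier *n)
{
    if (mr->ops->notify_started) {
        int ret = mr->ops->notify_started(mr, n->required);
        if (ret < 0) {
            return ret;             /* list is left untouched */
        }
    }
    n->next = mr->notifiers;
    mr->notifiers = n;
    return 0;
}
```

A consumer like VFIO, which needs every notification, would then see
the error at registration time and could refuse to set up the device.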

>      }
> -    notifier_list_add(&mr->iommu_notify, n);
> +    QLIST_INSERT_HEAD(&mr->iommu_notify, n, node);
>  }
>  
>  uint64_t memory_region_iommu_get_min_page_size(MemoryRegion *mr)
> @@ -1531,7 +1534,8 @@ uint64_t memory_region_iommu_get_min_page_size(MemoryRegion *mr)
>      return TARGET_PAGE_SIZE;
>  }
>  
> -void memory_region_iommu_replay(MemoryRegion *mr, Notifier *n, bool is_write)
> +void memory_region_iommu_replay(MemoryRegion *mr, IOMMUNotifier *n,
> +                                bool is_write)
>  {
>      hwaddr addr, granularity;
>      IOMMUTLBEntry iotlb;
> @@ -1552,11 +1556,12 @@ void memory_region_iommu_replay(MemoryRegion *mr, Notifier *n, bool is_write)
>      }
>  }
>  
> -void memory_region_unregister_iommu_notifier(MemoryRegion *mr, Notifier *n)
> +void memory_region_unregister_iommu_notifier(MemoryRegion *mr,
> +                                             IOMMUNotifier *n)
>  {
> -    notifier_remove(n);
> +    QLIST_REMOVE(n, node);
>      if (mr->iommu_ops->notify_stopped &&
> -        QLIST_EMPTY(&mr->iommu_notify.notifiers)) {
> +        QLIST_EMPTY(&mr->iommu_notify)) {
>          mr->iommu_ops->notify_stopped(mr);
>      }
>  }
> @@ -1564,8 +1569,22 @@ void memory_region_unregister_iommu_notifier(MemoryRegion *mr, Notifier *n)
>  void memory_region_notify_iommu(MemoryRegion *mr,
>                                  IOMMUTLBEntry entry)
>  {
> +    IOMMUNotifier *iommu_notifier;
> +    IOMMUNotifierCap request_cap;
> +
>      assert(memory_region_is_iommu(mr));
> -    notifier_list_notify(&mr->iommu_notify, &entry);
> +
> +    if (entry.perm & IOMMU_RW) {
> +        request_cap = IOMMU_NOTIFIER_CHANGE;
> +    } else {
> +        request_cap = IOMMU_NOTIFIER_INVALIDATION;
> +    }

As noted right at the top, I don't think this logic is really right.
An in-place change should be treated as both a map and unmap.
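
For the dispatch side, a sketch of what I mean: pass a mask of events
rather than a single either/or value, and call every notifier whose
wanted bits intersect it.  Again the names here (SketchNotifier,
sketch_notify, EV_*) are hypothetical and this is not tested against
the real tree.

```c
#include <stddef.h>

/* Hypothetical event bits a notifier can listen for. */
enum { EV_UNMAP = 0x1, EV_MAP = 0x2 };

typedef struct SketchNotifier SketchNotifier;
struct SketchNotifier {
    void (*notify)(SketchNotifier *n, void *data);
    unsigned wanted;                /* events this notifier listens for */
    int calls;                      /* bookkeeping for the example only */
    SketchNotifier *next;
};

/* Trivial callback that just counts how often it was invoked. */
static void sketch_count(SketchNotifier *n, void *data)
{
    (void)data;
    n->calls++;
}

/*
 * Deliver one change to all interested notifiers.  'events' may carry
 * both EV_UNMAP and EV_MAP for an in-place change, so map-only and
 * unmap-only listeners each get called exactly once.
 */
static void sketch_notify(SketchNotifier *head, unsigned events, void *data)
{
    SketchNotifier *n;

    for (n = head; n != NULL; n = n->next) {
        if (n->wanted & events) {
            n->notify(n, data);
        }
    }
}
```

An in-place change would be sent with EV_UNMAP | EV_MAP, so a
listener that only cares about invalidations still hears about it.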

> +    QLIST_FOREACH(iommu_notifier, &mr->iommu_notify, node) {
> +        if (iommu_notifier->notifier_caps & request_cap) {
> +            iommu_notifier->notify(iommu_notifier, &entry);
> +        }
> +    }
>  }
>  
>  void memory_region_set_log(MemoryRegion *mr, bool log, unsigned client)

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson
