From: David Gibson <david@gibson.dropbear.id.au>
To: Alexey Kardashevskiy <aik@ozlabs.ru>
Cc: Alex Williamson <alex.williamson@redhat.com>,
	qemu-ppc@nongnu.org, qemu-devel@nongnu.org,
	Alexander Graf <agraf@suse.de>
Subject: Re: [Qemu-devel] [PATCH v4 08/18] vfio: Add DMA memory registering
Date: Mon, 2 Feb 2015 18:04:07 +1100
Message-ID: <20150202070407.GN28703@voom.fritz.box>
In-Reply-To: <1422523650-2888-9-git-send-email-aik@ozlabs.ru>

On Thu, Jan 29, 2015 at 08:27:20PM +1100, Alexey Kardashevskiy wrote:
> This makes use of the new "memory registering" feature. The idea is
> to give the guest the ability to notify the host kernel about pages
> which are going to be used for DMA. With this information, the host
> kernel can pin them all, do locked-page accounting, and not spend
> time doing that at runtime, where failures cannot be handled nicely
> in some cases.
> 
> This adds a memory listener which notifies a VFIO container about memory
> which needs to be pinned/unpinned. VFIO MMIO regions are skipped.
> 
> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> ---
>  hw/vfio/common.c              | 99 ++++++++++++++++++++++++++++++++++++++++++-
>  include/hw/vfio/vfio-common.h |  3 +-
>  2 files changed, 100 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index cf483ff..c08f9ab 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -485,6 +485,99 @@ void vfio_listener_release(VFIOContainer *container)
>      memory_listener_unregister(&container->iommu_data.type1.listener);
>  }
>  
> +static int vfio_mem_register(VFIOContainer *container,
> +                             void *vaddr, ram_addr_t size)
> +{
> +    struct vfio_iommu_type1_register_memory reg = {
> +        .argsz = sizeof(reg),
> +        .vaddr = (__u64)(uintptr_t)vaddr,
> +        .size = size,
> +    };
> +    long ret = ioctl(container->fd, VFIO_IOMMU_REGISTER_MEMORY, &reg);
> +
> +    DPRINTF("(%u) %s %u: va=%lx size=%lx, ret=%ld\n", getpid(),
> +            __func__, __LINE__, reg.vaddr, reg.size, ret);
> +
> +    return ret;
> +}
> +
> +static int vfio_mem_unregister(VFIOContainer *container,
> +                               void *vaddr, ram_addr_t size)
> +{
> +    struct vfio_iommu_type1_unregister_memory reg = {
> +        .argsz = sizeof(reg),
> +        .vaddr = (__u64)(uintptr_t)vaddr,
> +        .size = size,
> +    };
> +    long ret = ioctl(container->fd, VFIO_IOMMU_UNREGISTER_MEMORY, &reg);
> +
> +    DPRINTF("(%u) %s %u: va=%lx size=%lx, ret=%ld\n", getpid(),
> +            __func__, __LINE__, reg.vaddr, reg.size, ret);
> +
> +    return ret;
> +}
> +
> +static void vfio_ram_region_add(MemoryListener *listener,
> +                                MemoryRegionSection *section)
> +{
> +    VFIOContainer *container = container_of(listener, VFIOContainer,
> +                                            iommu_data.type1.ramlistener);
> +    hwaddr end;
> +    Int128 llend;
> +    void *vaddr;
> +
> +    if (!memory_region_is_ram(section->mr) ||
> +        memory_region_is_skip_dump(section->mr)) {
> +        return;
> +    }
> +
> +    llend = int128_make64(section->offset_within_address_space);
> +    llend = int128_add(llend, section->size);

Does this need to add TARGET_PAGE_SIZE - 1 in order to make it round up
to a page boundary, rather than truncate to a page boundary?

> +    llend = int128_and(llend, int128_exts64(TARGET_PAGE_MASK));
> +
> +    memory_region_ref(section->mr);
> +
> +    end = int128_get64(llend);
> +    vaddr = memory_region_get_ram_ptr(section->mr) +
> +        section->offset_within_region;
> +    vfio_mem_register(container, vaddr, end);
> +}
> +
> +static void vfio_ram_region_del(MemoryListener *listener,
> +                                MemoryRegionSection *section)
> +{
> +    VFIOContainer *container = container_of(listener, VFIOContainer,
> +                                            iommu_data.type1.ramlistener);
> +    hwaddr iova, end;
> +    void *vaddr;
> +
> +    if (!memory_region_is_ram(section->mr) ||
> +        memory_region_is_skip_dump(section->mr)) {
> +        return;
> +    }
> +
> +
> +    iova = TARGET_PAGE_ALIGN(section->offset_within_address_space);
> +    end = (section->offset_within_address_space + int128_get64(section->size)) &
> +        TARGET_PAGE_MASK;
> +    vaddr = memory_region_get_ram_ptr(section->mr) +
> +        section->offset_within_region +
> +        (iova - section->offset_within_address_space);

It's not clear to me why region_add and region_del have a different
set of address calculations.

> +    vfio_mem_unregister(container, vaddr, end - iova);
> +}
> +
> +const MemoryListener vfio_ram_listener = {
> +    .region_add = vfio_ram_region_add,
> +    .region_del = vfio_ram_region_del,
> +};
> +
> +static void vfio_spapr_listener_release(VFIOContainer *container)
> +{
> +    memory_listener_unregister(&container->iommu_data.type1.listener);
> +    memory_listener_unregister(&container->iommu_data.type1.ramlistener);

Accessing fields within type1 from a function whose name says it's
spapr specific seems very wrong.

> +}
> +
>  int vfio_mmap_region(Object *obj, VFIORegion *region,
>                       MemoryRegion *mem, MemoryRegion *submem,
>                       void **map, size_t size, off_t offset,
> @@ -705,6 +798,10 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as)
>              goto free_container_exit;
>          }
>  
> +        container->iommu_data.type1.ramlistener = vfio_ram_listener;
> +        memory_listener_register(&container->iommu_data.type1.ramlistener,
> +                                 &address_space_memory);

Why two separate listeners, rather than doing both jobs from a single listener?

>          /*
>           * The host kernel code implementing VFIO_IOMMU_DISABLE is called
>           * when container fd is closed so we do not call it explicitly
> @@ -718,7 +815,7 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as)
>          }
>  
>          container->iommu_data.type1.listener = vfio_memory_listener;
> -        container->iommu_data.release = vfio_listener_release;
> +        container->iommu_data.release = vfio_spapr_listener_release;
>  
>          memory_listener_register(&container->iommu_data.type1.listener,
>                                   container->space->as);
> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
> index 1d5bfe8..3f91216 100644
> --- a/include/hw/vfio/vfio-common.h
> +++ b/include/hw/vfio/vfio-common.h
> @@ -66,6 +66,7 @@ struct VFIOGroup;
>  
>  typedef struct VFIOType1 {
>      MemoryListener listener;
> +    MemoryListener ramlistener;
>      int error;
>      bool initialized;
>  } VFIOType1;
> @@ -144,7 +145,7 @@ int vfio_get_device(VFIOGroup *group, const char *name,
>                      VFIODevice *vbasedev);
>  
>  extern const MemoryRegionOps vfio_region_ops;
> -extern const MemoryListener vfio_memory_listener;
> +extern const MemoryListener vfio_memory_listener, vfio_ram_listener;
>  extern QLIST_HEAD(vfio_group_head, VFIOGroup) vfio_group_list;
>  extern QLIST_HEAD(vfio_as_head, VFIOAddressSpace) vfio_address_spaces;
>  

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

