All of lore.kernel.org
 help / color / mirror / Atom feed
From: Alexey Kardashevskiy <aik@ozlabs.ru>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: qemu-devel@nongnu.org, qemu-ppc@nongnu.org,
	Alexander Graf <agraf@suse.de>,
	David Gibson <david@gibson.dropbear.id.au>,
	Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [Qemu-devel] [PATCH qemu v16 01/19] vfio: Delay DMA address space listener release
Date: Fri, 13 May 2016 17:16:48 +1000	[thread overview]
Message-ID: <0f2ca845-0c9a-5446-ea69-371ea866319e@ozlabs.ru> (raw)
In-Reply-To: <20160505163941.7628431c@t450s.home>

On 05/06/2016 08:39 AM, Alex Williamson wrote:
> On Wed,  4 May 2016 16:52:13 +1000
> Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
>
>> This postpones VFIO container deinitialization to let region_del()
>> callbacks (called via vfio_listener_release) do proper clean up
>> while the group is still attached to the container.
>
> Any mappings within the container should clean themselves up when the
> container is deprivleged by removing the last group in the kernel. Is
> the issue that that doesn't happen, which would be a spapr vfio kernel
> bug, or that our QEMU side structures get all out of whack if we let
> that happen?

My mailbase got corrupted, missed that.

This is mostly for "[PATCH qemu v16 17/19] spapr_iommu, vfio, memory: 
Notify IOMMU about starting/stopping being used by VFIO", I should have put 
01/19 and 02/19 right before 17/19, sorry about that.


Every reboot the spapr machine removes all (i.e. one or two) windows and 
creates the default one.

I do this by memory_region_del_subregion(iommu_mr) + 
memory_region_add_subregion(iommu_mr). Which gets translated to 
VFIO_IOMMU_SPAPR_TCE_REMOVE + VFIO_IOMMU_SPAPR_TCE_CREATE via 
vfio_memory_listener if there is VFIO; no direct calls from spapr to vfio 
=> cool. During the machine reset, the VFIO device is there with the 
container and groups attached, at some point with no windows.

Now to VFIO plug/unplug.

When VFIO plug happens, vfio_memory_listener is created, region_add() is 
called, the hardware window is created (via VFIO_IOMMU_SPAPR_TCE_CREATE).
Unplugging should end up doing VFIO_IOMMU_SPAPR_TCE_REMOVE somehow. If 
region_del() is not called when the container is being destroyed (as before 
this patchset), then the kernel cleans and destroys windows when 
close(container->fd) is called or when qemu is killed (and this fd is 
naturally closed), I hope this answers the comment from 02/19.

So far so good (right?)

However I also have a guest view of the TCE table, this is what the guest 
sees and this is what emulated PCI devices use. This guest view is either 
allocated in the KVM (so H_PUT_TCE can be handled quickly right in the host 
kernel, even in real mode) or userspace (VFIO case).

I generally want the guest view to be in the KVM. However when I plug VFIO, 
I have to move the table to the userspace. When I unplug VFIO, I want to do 
the opposite so I need a way to tell spapr that it can move the table. 
region_del() seemed a natural way of doing this as region_add() is already 
doing the opposite part.

With this patchset, each IOMMU MR gets a usage counter, region_add() does 
+1, region_del() does -1 (yeah, not extremely optimal during reset). When 
the counter goes from 0 to 1, vfio_start() hook is called, when the counter 
becomes 0 - vfio_stop(). Note that we may have multiple VFIO containers on 
the same PHB.

Without 01/19 and 02/19, I'll have to repeat region_del()'s counter 
decrement steps in vfio_disconnect_container(). And I still cannot move 
counting from region_add() to vfio_connect_container() so there will be 
asymmetry which I am fine with, I am just checking here - what would be the 
best approach here?


Thanks.



>
>>
>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>> ---
>>  hw/vfio/common.c | 22 +++++++++++++++-------
>>  1 file changed, 15 insertions(+), 7 deletions(-)
>>
>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
>> index fe5ec6a..0b40262 100644
>> --- a/hw/vfio/common.c
>> +++ b/hw/vfio/common.c
>> @@ -921,23 +921,31 @@ static void vfio_disconnect_container(VFIOGroup *group)
>>  {
>>      VFIOContainer *container = group->container;
>>
>> -    if (ioctl(group->fd, VFIO_GROUP_UNSET_CONTAINER, &container->fd)) {
>> -        error_report("vfio: error disconnecting group %d from container",
>> -                     group->groupid);
>> -    }
>> -
>>      QLIST_REMOVE(group, container_next);
>> +
>> +    if (QLIST_EMPTY(&container->group_list)) {
>> +        VFIOGuestIOMMU *giommu;
>> +
>> +        vfio_listener_release(container);
>> +
>> +        QLIST_FOREACH(giommu, &container->giommu_list, giommu_next) {
>> +            memory_region_unregister_iommu_notifier(&giommu->n);
>> +        }
>> +    }
>> +
>>      group->container = NULL;
>> +    if (ioctl(group->fd, VFIO_GROUP_UNSET_CONTAINER, &container->fd)) {
>> +        error_report("vfio: error disconnecting group %d from container",
>> +                     group->groupid);
>> +    }
>>
>>      if (QLIST_EMPTY(&container->group_list)) {
>>          VFIOAddressSpace *space = container->space;
>>          VFIOGuestIOMMU *giommu, *tmp;
>>
>> -        vfio_listener_release(container);
>>          QLIST_REMOVE(container, next);
>>
>>          QLIST_FOREACH_SAFE(giommu, &container->giommu_list, giommu_next, tmp) {
>> -            memory_region_unregister_iommu_notifier(&giommu->n);
>>              QLIST_REMOVE(giommu, giommu_next);
>>              g_free(giommu);
>>          }
>
> I'm not spotting why this is a 2-pass process vs simply moving the
> existing QLIST_EMPTY cleanup above the ioctl.  Thanks,





-- 
Alexey

  reply	other threads:[~2016-05-13  7:17 UTC|newest]

Thread overview: 69+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-05-04  6:52 [Qemu-devel] [PATCH qemu v16 00/19] spapr: vfio: Enable Dynamic DMA windows (DDW) Alexey Kardashevskiy
2016-05-04  6:52 ` [Qemu-devel] [PATCH qemu v16 01/19] vfio: Delay DMA address space listener release Alexey Kardashevskiy
2016-05-05 22:39   ` Alex Williamson
2016-05-13  7:16     ` Alexey Kardashevskiy [this message]
2016-05-13 22:24       ` Alex Williamson
2016-05-25  6:34         ` David Gibson
2016-05-25 13:59           ` Alex Williamson
2016-05-26  1:00             ` David Gibson
2016-05-04  6:52 ` [Qemu-devel] [PATCH qemu v16 02/19] memory: Call region_del() callbacks on memory listener unregistering Alexey Kardashevskiy
2016-05-05 22:45   ` Alex Williamson
2016-05-26  1:48     ` David Gibson
2016-05-04  6:52 ` [Qemu-devel] [PATCH qemu v16 03/19] memory: Fix IOMMU replay base address Alexey Kardashevskiy
2016-05-26  1:50   ` David Gibson
2016-05-04  6:52 ` [Qemu-devel] [PATCH qemu v16 04/19] vmstate: Define VARRAY with VMS_ALLOC Alexey Kardashevskiy
2016-05-27  7:54   ` Alexey Kardashevskiy
2016-06-01  2:29     ` Alexey Kardashevskiy
2016-06-01  8:11       ` Paolo Bonzini
2016-06-02  0:43         ` David Gibson
2016-05-04  6:52 ` [Qemu-devel] [PATCH qemu v16 05/19] vfio: Check that IOMMU MR translates to system address space Alexey Kardashevskiy
2016-05-26  1:51   ` David Gibson
2016-05-04  6:52 ` [Qemu-devel] [PATCH qemu v16 06/19] spapr_pci: Use correct DMA LIOBN when composing the device tree Alexey Kardashevskiy
2016-05-26  3:17   ` David Gibson
2016-05-04  6:52 ` [Qemu-devel] [PATCH qemu v16 07/19] spapr_iommu: Move table allocation to helpers Alexey Kardashevskiy
2016-05-26  3:32   ` David Gibson
2016-05-04  6:52 ` [Qemu-devel] [PATCH qemu v16 08/19] spapr_iommu: Introduce "enabled" state for TCE table Alexey Kardashevskiy
2016-05-26  3:39   ` David Gibson
2016-05-27  8:01     ` Alexey Kardashevskiy
2016-05-04  6:52 ` [Qemu-devel] [PATCH qemu v16 09/19] spapr_iommu: Finish renaming vfio_accel to need_vfio Alexey Kardashevskiy
2016-05-26  3:18   ` David Gibson
2016-05-04  6:52 ` [Qemu-devel] [PATCH qemu v16 10/19] spapr_iommu: Migrate full state Alexey Kardashevskiy
2016-05-26  4:01   ` David Gibson
2016-05-31  8:19     ` Alexey Kardashevskiy
2016-05-04  6:52 ` [Qemu-devel] [PATCH qemu v16 11/19] spapr_iommu: Add root memory region Alexey Kardashevskiy
2016-05-04  6:52 ` [Qemu-devel] [PATCH qemu v16 12/19] spapr_pci: Reset DMA config on PHB reset Alexey Kardashevskiy
2016-05-04  6:52 ` [Qemu-devel] [PATCH qemu v16 13/19] memory: Add reporting of supported page sizes Alexey Kardashevskiy
2016-05-04  6:52 ` [Qemu-devel] [PATCH qemu v16 14/19] vfio: spapr: Add DMA memory preregistering (SPAPR IOMMU v2) Alexey Kardashevskiy
2016-05-13 22:25   ` Alex Williamson
2016-05-16  1:10     ` Alexey Kardashevskiy
2016-05-16 20:20       ` Alex Williamson
2016-05-26  4:53         ` David Gibson
2016-05-04  6:52 ` [Qemu-devel] [PATCH qemu v16 15/19] spapr_pci: Add and export DMA resetting helper Alexey Kardashevskiy
2016-05-04  6:52 ` [Qemu-devel] [PATCH qemu v16 16/19] vfio: Add host side DMA window capabilities Alexey Kardashevskiy
2016-05-13 22:25   ` Alex Williamson
2016-05-27  0:36     ` David Gibson
2016-05-04  6:52 ` [Qemu-devel] [PATCH qemu v16 17/19] spapr_iommu, vfio, memory: Notify IOMMU about starting/stopping being used by VFIO Alexey Kardashevskiy
2016-05-13 22:26   ` Alex Williamson
2016-05-16  8:35     ` Alexey Kardashevskiy
2016-05-16 20:13       ` Alex Williamson
2016-05-20  8:04         ` [Qemu-devel] [RFC PATCH qemu] spapr_iommu, vfio, memory: Notify IOMMU about starting/stopping listening Alexey Kardashevskiy
2016-05-20 15:19           ` Alex Williamson
2016-05-27  0:43           ` David Gibson
2016-05-04  6:52 ` [Qemu-devel] [PATCH qemu v16 18/19] vfio/spapr: Create DMA window dynamically (SPAPR IOMMU v2) Alexey Kardashevskiy
2016-05-13 22:26   ` Alex Williamson
2016-05-16  4:52     ` Alexey Kardashevskiy
2016-05-16 20:20       ` Alex Williamson
2016-05-27  0:50         ` David Gibson
2016-05-27  3:49         ` Alexey Kardashevskiy
2016-05-27  4:05           ` David Gibson
2016-05-04  6:52 ` [Qemu-devel] [PATCH qemu v16 19/19] spapr_pci/spapr_pci_vfio: Support Dynamic DMA Windows (DDW) Alexey Kardashevskiy
2016-05-13  8:41   ` Bharata B Rao
2016-05-13  8:49     ` Bharata B Rao
2016-05-16  6:25     ` Alexey Kardashevskiy
2016-05-17  5:32       ` Bharata B Rao
2016-05-27  4:44         ` David Gibson
2016-05-27  5:49           ` Bharata B Rao
2016-06-01  3:32             ` Bharata B Rao
2016-05-27  4:42     ` David Gibson
2016-05-13  4:54 ` [Qemu-devel] [PATCH qemu v16 00/19] spapr: vfio: Enable Dynamic DMA windows (DDW) Alexey Kardashevskiy
2016-05-13  5:36   ` Alex Williamson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=0f2ca845-0c9a-5446-ea69-371ea866319e@ozlabs.ru \
    --to=aik@ozlabs.ru \
    --cc=agraf@suse.de \
    --cc=alex.williamson@redhat.com \
    --cc=david@gibson.dropbear.id.au \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=qemu-ppc@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.