All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH] vfio-pci: unparent BAR subregions
@ 2015-01-30 23:55 Alex Williamson
  2015-01-31  8:43 ` Paolo Bonzini
  0 siblings, 1 reply; 5+ messages in thread
From: Alex Williamson @ 2015-01-30 23:55 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paolo Bonzini, Alex Williamson, qemu-stable

Commit d8d95814609e replaced a number of memory_region_destroy()
calls with object_unparent() calls.  The logic appears to be that
subregions need to be unparented, but the base region is destroyed
with the device object.  Doing hotplug testing with vfio-pci I
occasionally get a segfault from object_finalize_child_property()
due to completely bogus class pointers on the child Object.  Adding
the explicit object_unparent() for these subregions resolves the
problem, however I question the sanity of the Memory API now where
we sometimes need to destroy MemoryRegions, but the rules aren't
clear and there's no longer a memory_region_destroy() function, so
we need to reach over to some other random QEMU API and unparent
an object that we barely know about and certainly didn't explicitly
parent previously.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: qemu-stable@nongnu.org
---

 hw/vfio/pci.c |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 014a92c..c71499e 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -2294,10 +2294,12 @@ static void vfio_unmap_bar(VFIOPCIDevice *vdev, int nr)
 
     memory_region_del_subregion(&bar->region.mem, &bar->region.mmap_mem);
     munmap(bar->region.mmap, memory_region_size(&bar->region.mmap_mem));
+    object_unparent(OBJECT(&bar->region.mmap_mem));
 
     if (vdev->msix && vdev->msix->table_bar == nr) {
         memory_region_del_subregion(&bar->region.mem, &vdev->msix->mmap_mem);
         munmap(vdev->msix->mmap, memory_region_size(&vdev->msix->mmap_mem));
+        object_unparent(OBJECT(&vdev->msix->mmap_mem));
     }
 }
 

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [Qemu-devel] [PATCH] vfio-pci: unparent BAR subregions
  2015-01-30 23:55 [Qemu-devel] [PATCH] vfio-pci: unparent BAR subregions Alex Williamson
@ 2015-01-31  8:43 ` Paolo Bonzini
  2015-01-31 15:10   ` Alex Williamson
  0 siblings, 1 reply; 5+ messages in thread
From: Paolo Bonzini @ 2015-01-31  8:43 UTC (permalink / raw)
  To: Alex Williamson, qemu-devel; +Cc: qemu-stable



On 31/01/2015 00:55, Alex Williamson wrote:
> Commit d8d95814609e replaced a number of memory_region_destroy()
> calls with object_unparent() calls.  The logic appears to be that
> subregions need to be unparented, but the base region is destroyed
> with the device object.  Doing hotplug testing with vfio-pci I
> occasionally get a segfault from object_finalize_child_property()
> due to completely bogus class pointers on the child Object.  Adding
> the explicit object_unparent() for these subregions resolves the
> problem, however I question the sanity of the Memory API now where
> we sometimes need to destroy MemoryRegions, but the rules aren't
> clear

There is no memory_region_destroy API because you cannot destroy
MemoryRegions.  All you do is releasing the link between the VFIO device
(the parent, specified in memory_region_init*) and the MemoryRegion.
The link caused the VFIO device to keep the MemoryRegion alive.

There can be pending references to the VFIO device at unrealize time,
and this is why the memory_region_destroy() API was not enough.  For
example if someone was doing I/O to a BAR and thus address_space_map is
keeping the VFIO device alive.

The explicit memory_region_destroy() function made it much harder to
handle this case.  You had to define an instance_finalize function for
every class, and do memory_region_destroy() there.  Not surprisingly, no
one did that.  Sure, it's not a common case and a well-behaving guest
does not do that, but if it does it means use-after-frees and thus a
possible guest->host escalation.

Instead, the implicit destruction via reference counting makes this case
easy to handle, because reclamation is done automatically when the VFIO
device dies.

Explicit object_unparent() is only needed if you recreate the memory
region during the lifetime of the object.  This is rarely needed, and it
is simple to spot if it's needed.  If you do memory_region_init* outside
the realize function, most likely you need a matching object_unparent
somewhere else in the device logic.

This was the idea behind commit d8d95814609e.  It only touched a handful
of files because almost no one does memory_region_init* outside the
realize function, and in particular VFIO doesn't.  VFIO follows the
common convention of only creating regions in realize, and thus does not
need object_unparent.

> and there's no longer a memory_region_destroy() function, so
> we need to reach over to some other random QEMU API

It's not random.  Object is the parent class of MemoryRegion.
object_unparent is a method for MemoryRegion.

> and unparent an object that we barely know about

I'm not sure about this?  You certainly know the memory regions you create.

> and certainly didn't explicitly parent previously.

You did when you passed the VFIO device to memory_region_init*.

I'm afraid this patch is incorrect.  You have to find out where the
region is being overwritten.

Thanks,

Paolo

> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: qemu-stable@nongnu.org
> ---
> 
>  hw/vfio/pci.c |    2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> index 014a92c..c71499e 100644
> --- a/hw/vfio/pci.c
> +++ b/hw/vfio/pci.c
> @@ -2294,10 +2294,12 @@ static void vfio_unmap_bar(VFIOPCIDevice *vdev, int nr)
>  
>      memory_region_del_subregion(&bar->region.mem, &bar->region.mmap_mem);
>      munmap(bar->region.mmap, memory_region_size(&bar->region.mmap_mem));
> +    object_unparent(OBJECT(&bar->region.mmap_mem));
>  
>      if (vdev->msix && vdev->msix->table_bar == nr) {
>          memory_region_del_subregion(&bar->region.mem, &vdev->msix->mmap_mem);
>          munmap(vdev->msix->mmap, memory_region_size(&vdev->msix->mmap_mem));
> +        object_unparent(OBJECT(&vdev->msix->mmap_mem));
>      }
>  }
>  
> 
> 
> 

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [Qemu-devel] [PATCH] vfio-pci: unparent BAR subregions
  2015-01-31  8:43 ` Paolo Bonzini
@ 2015-01-31 15:10   ` Alex Williamson
  2015-01-31 20:47     ` Paolo Bonzini
  0 siblings, 1 reply; 5+ messages in thread
From: Alex Williamson @ 2015-01-31 15:10 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, qemu-stable

On Sat, 2015-01-31 at 09:43 +0100, Paolo Bonzini wrote:
> 
> On 31/01/2015 00:55, Alex Williamson wrote:
> > Commit d8d95814609e replaced a number of memory_region_destroy()
> > calls with object_unparent() calls.  The logic appears to be that
> > subregions need to be unparented, but the base region is destroyed
> > with the device object.  Doing hotplug testing with vfio-pci I
> > occasionally get a segfault from object_finalize_child_property()
> > due to completely bogus class pointers on the child Object.  Adding
> > the explicit object_unparent() for these subregions resolves the
> > problem, however I question the sanity of the Memory API now where
> > we sometimes need to destroy MemoryRegions, but the rules aren't
> > clear
> 
> There is no memory_region_destroy API because you cannot destroy
> MemoryRegions.  All you do is releasing the link between the VFIO device
> (the parent, specified in memory_region_init*) and the MemoryRegion.
> The link caused the VFIO device to keep the MemoryRegion alive.
> 
> There can be pending references to the VFIO device at unrealize time,
> and this is why the memory_region_destroy() API was not enough.  For
> example if someone was doing I/O to a BAR and thus address_space_map is
> keeping the VFIO device alive.
> 
> The explicit memory_region_destroy() function made it much harder to
> handle this case.  You had to define an instance_finalize function for
> every class, and do memory_region_destroy() there.  Not surprisingly, no
> one did that.  Sure, it's not a common case and a well-behaving guest
> does not do that, but if it does it means use-after-frees and thus a
> possible guest->host escalation.
> 
> Instead, the implicit destruction via reference counting makes this case
> easy to handle, because reclamation is done automatically when the VFIO
> device dies.
> 
> Explicit object_unparent() is only needed if you recreate the memory
> region during the lifetime of the object.  This is rarely needed, and it
> is simple to spot if it's needed.  If you do memory_region_init* outside
> the realize function, most likely you need a matching object_unparent
> somewhere else in the device logic.
> 
> This was the idea behind commit d8d95814609e.  It only touched a handful
> of files because almost no one does memory_region_init* outside the
> realize function, and in particular VFIO doesn't.  VFIO follows the
> common convention of only creating regions in realize, and thus does not
> need object_unparent.
> 
> > and there's no longer a memory_region_destroy() function, so
> > we need to reach over to some other random QEMU API
> 
> It's not random.  Object is the parent class of MemoryRegion.
> object_unparent is a method for MemoryRegion.
> 
> > and unparent an object that we barely know about
> 
> I'm not sure about this?  You certainly know the memory regions you create.
> 
> > and certainly didn't explicitly parent previously.
> 
> You did when you passed the VFIO device to memory_region_init*.
> 
> I'm afraid this patch is incorrect.  You have to find out where the
> region is being overwritten.

Thanks Paolo, so if I look more closely at where you added
object_unparent() calls in d8d95814609e, I can see that they're
associated with dynamically allocated objects that are freed as part of
the vfio device exitfn.  vdev->msix is also such a structure and is the
property causing us the segfaults.  Being associated with a free also
explains the randomness of the segfault.  So, I think the second
object_unparent() call is correct and that the guiding principle is that
any MemoryRegion associated with a dynamically allocated structure and
freed as part of the class exit callback needs to be explicitly
unparented.  Does that sound right?  Thanks,

Alex

> > Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > Cc: qemu-stable@nongnu.org
> > ---
> > 
> >  hw/vfio/pci.c |    2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> > index 014a92c..c71499e 100644
> > --- a/hw/vfio/pci.c
> > +++ b/hw/vfio/pci.c
> > @@ -2294,10 +2294,12 @@ static void vfio_unmap_bar(VFIOPCIDevice *vdev, int nr)
> >  
> >      memory_region_del_subregion(&bar->region.mem, &bar->region.mmap_mem);
> >      munmap(bar->region.mmap, memory_region_size(&bar->region.mmap_mem));
> > +    object_unparent(OBJECT(&bar->region.mmap_mem));
> >  
> >      if (vdev->msix && vdev->msix->table_bar == nr) {
> >          memory_region_del_subregion(&bar->region.mem, &vdev->msix->mmap_mem);
> >          munmap(vdev->msix->mmap, memory_region_size(&vdev->msix->mmap_mem));
> > +        object_unparent(OBJECT(&vdev->msix->mmap_mem));
> >      }
> >  }
> >  
> > 
> > 
> > 

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [Qemu-devel] [PATCH] vfio-pci: unparent BAR subregions
  2015-01-31 15:10   ` Alex Williamson
@ 2015-01-31 20:47     ` Paolo Bonzini
  2015-02-01 16:14       ` Alex Williamson
  0 siblings, 1 reply; 5+ messages in thread
From: Paolo Bonzini @ 2015-01-31 20:47 UTC (permalink / raw)
  To: Alex Williamson; +Cc: qemu-devel, qemu-stable



On 31/01/2015 16:10, Alex Williamson wrote:
>> Explicit object_unparent() is only needed if you recreate the memory
>> region during the lifetime of the object.  This is rarely needed, and it
>> is simple to spot if it's needed.  If you do memory_region_init* outside
>> the realize function, most likely you need a matching object_unparent
>> somewhere else in the device logic.
>>
>> This was the idea behind commit d8d95814609e.  It only touched a handful
>> of files because almost no one does memory_region_init* outside the
>> realize function, and in particular VFIO doesn't.  VFIO follows the
>> common convention of only creating regions in realize, and thus does not
>> need object_unparent.
> 
> Thanks Paolo, so if I look more closely at where you added
> object_unparent() calls in d8d95814609e, I can see that they're
> associated with dynamically allocated objects that are freed as part of
> the vfio device exitfn.  vdev->msix is also such a structure and is the
> property causing us the segfaults.

Yeah, this is a different case than the one I mentioned above, and
you're right that this is also not exactly the common convention---so it
is a second case of needing an explicit object_unparent.

> So, I think the second
> object_unparent() call is correct and that the guiding principle is that
> any MemoryRegion associated with a dynamically allocated structure and
> freed as part of the class exit callback needs to be explicitly
> unparented.  Does that sound right?

The pedant in me asks to do the object_unparent in vfio_put_device, just
before freeing vdev->msix.  This way you already abide by RCU's
principle of separating removal and reclamation (and I won't have to
move it in part 3 of the RCU patches, which is the next to be posted;
that part will move the vfio_put_device call to instance_finalize).

Another possible change would be to make vdev->msix statically allocated
(checking vdev->msix.entries != 0 instead of vdev->msix != NULL,
possibly hidden beneath an inline function vfio_has_msix).  Then you'd
be in the exact same case as the other memory regions and wouldn't need
an unparent at all.  But I am not certain myself of the relative beauty
of this, and it is obviously less suitable for qemu-stable, so do go
ahead with the one-liner.

I'll improve the documentation as soon as the code actually follows the
principles that have to be documented.  But now that the ball is rolling
on RCU and multithreading, documentation is indeed getting more and more
important.

Thanks,

Paolo

> Alex
> 
>>> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
>>> Cc: Paolo Bonzini <pbonzini@redhat.com>
>>> Cc: qemu-stable@nongnu.org
>>> ---
>>>
>>>  hw/vfio/pci.c |    2 ++
>>>  1 file changed, 2 insertions(+)
>>>
>>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
>>> index 014a92c..c71499e 100644
>>> --- a/hw/vfio/pci.c
>>> +++ b/hw/vfio/pci.c
>>> @@ -2294,10 +2294,12 @@ static void vfio_unmap_bar(VFIOPCIDevice *vdev, int nr)
>>>  
>>>      memory_region_del_subregion(&bar->region.mem, &bar->region.mmap_mem);
>>>      munmap(bar->region.mmap, memory_region_size(&bar->region.mmap_mem));
>>> +    object_unparent(OBJECT(&bar->region.mmap_mem));
>>>  
>>>      if (vdev->msix && vdev->msix->table_bar == nr) {
>>>          memory_region_del_subregion(&bar->region.mem, &vdev->msix->mmap_mem);
>>>          munmap(vdev->msix->mmap, memory_region_size(&vdev->msix->mmap_mem));
>>> +        object_unparent(OBJECT(&vdev->msix->mmap_mem));
>>>      }
>>>  }
>>>  
>>>
>>>
>>>
> 
> 
> 
> 
> 

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [Qemu-devel] [PATCH] vfio-pci: unparent BAR subregions
  2015-01-31 20:47     ` Paolo Bonzini
@ 2015-02-01 16:14       ` Alex Williamson
  0 siblings, 0 replies; 5+ messages in thread
From: Alex Williamson @ 2015-02-01 16:14 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, qemu-stable

On Sat, 2015-01-31 at 21:47 +0100, Paolo Bonzini wrote:
> 
> On 31/01/2015 16:10, Alex Williamson wrote:
> >> Explicit object_unparent() is only needed if you recreate the memory
> >> region during the lifetime of the object.  This is rarely needed, and it
> >> is simple to spot if it's needed.  If you do memory_region_init* outside
> >> the realize function, most likely you need a matching object_unparent
> >> somewhere else in the device logic.
> >>
> >> This was the idea behind commit d8d95814609e.  It only touched a handful
> >> of files because almost no one does memory_region_init* outside the
> >> realize function, and in particular VFIO doesn't.  VFIO follows the
> >> common convention of only creating regions in realize, and thus does not
> >> need object_unparent.
> > 
> > Thanks Paolo, so if I look more closely at where you added
> > object_unparent() calls in d8d95814609e, I can see that they're
> > associated with dynamically allocated objects that are freed as part of
> > the vfio device exitfn.  vdev->msix is also such a structure and is the
> > property causing us the segfaults.
> 
> Yeah, this is a different case than the one I mentioned above, and
> you're right that this is also not exactly the common convention---so it
> is a second case of needing an explicit object_unparent.
> 
> > So, I think the second
> > object_unparent() call is correct and that the guiding principle is that
> > any MemoryRegion associated with a dynamically allocated structure and
> > freed as part of the class exit callback needs to be explicitly
> > unparented.  Does that sound right?
> 
> The pedant in me asks to do the object_unparent in vfio_put_device, just
> before freeing vdev->msix.  This way you already abide by RCU's
> principle of separating removal and reclamation (and I won't have to
> move it in part 3 of the RCU patches, which is the next to be posted;
> that part will move the vfio_put_device call to instance_finalize).

Ok.

> Another possible change would be to make vdev->msix statically allocated
> (checking vdev->msix.entries != 0 instead of vdev->msix != NULL,
> possibly hidden beneath an inline function vfio_has_msix).  Then you'd
> be in the exact same case as the other memory regions and wouldn't need
> an unparent at all.  But I am not certain myself of the relative beauty
> of this, and it is obviously less suitable for qemu-stable, so do go
> ahead with the one-liner.

The pendant in me prefers that the MSI-X structure is only allocated for
devices that actually have MSI-X ;)  v2 posted.

> I'll improve the documentation as soon as the code actually follows the
> principles that have to be documented.  But now that the ball is rolling
> on RCU and multithreading, documentation is indeed getting more and more
> important.

Thanks, it's definitely not obvious how this works in the current docs.

Alex

> >>> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> >>> Cc: Paolo Bonzini <pbonzini@redhat.com>
> >>> Cc: qemu-stable@nongnu.org
> >>> ---
> >>>
> >>>  hw/vfio/pci.c |    2 ++
> >>>  1 file changed, 2 insertions(+)
> >>>
> >>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> >>> index 014a92c..c71499e 100644
> >>> --- a/hw/vfio/pci.c
> >>> +++ b/hw/vfio/pci.c
> >>> @@ -2294,10 +2294,12 @@ static void vfio_unmap_bar(VFIOPCIDevice *vdev, int nr)
> >>>  
> >>>      memory_region_del_subregion(&bar->region.mem, &bar->region.mmap_mem);
> >>>      munmap(bar->region.mmap, memory_region_size(&bar->region.mmap_mem));
> >>> +    object_unparent(OBJECT(&bar->region.mmap_mem));
> >>>  
> >>>      if (vdev->msix && vdev->msix->table_bar == nr) {
> >>>          memory_region_del_subregion(&bar->region.mem, &vdev->msix->mmap_mem);
> >>>          munmap(vdev->msix->mmap, memory_region_size(&vdev->msix->mmap_mem));
> >>> +        object_unparent(OBJECT(&vdev->msix->mmap_mem));
> >>>      }
> >>>  }
> >>>  
> >>>
> >>>
> >>>
> > 
> > 
> > 
> > 
> > 

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2015-02-01 16:14 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-01-30 23:55 [Qemu-devel] [PATCH] vfio-pci: unparent BAR subregions Alex Williamson
2015-01-31  8:43 ` Paolo Bonzini
2015-01-31 15:10   ` Alex Williamson
2015-01-31 20:47     ` Paolo Bonzini
2015-02-01 16:14       ` Alex Williamson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.