All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH] exec.c: check RAMBlock validity before changing its flag
@ 2018-07-01 17:19 Cédric Le Goater
  2018-07-02  3:57 ` Peter Xu
  0 siblings, 1 reply; 10+ messages in thread
From: Cédric Le Goater @ 2018-07-01 17:19 UTC (permalink / raw)
  To: qemu-devel, Dr . David Alan Gilbert
  Cc: Peter Maydell, Paolo Bonzini, Alex Williamson, David Gibson,
	Michael S. Tsirkin, Marcel Apfelbaum, Cédric Le Goater

When a PCI device is unplugged, the PCI memory regions are deleted
before the optional ROM RAMBlock is flagged non-migratable. But, when
this is done, the RAMBlock has already been cleared from the region,
leading to a segv.

Fix the issue by testing the RAMBlock before flagging it, as it is
done in qemu_ram_unset_idstr()

Signed-off-by: Cédric Le Goater <clg@kaod.org>
---

 I caught this bug while deleting a passthrough device from a pseries
 machine. Here is the stack:
   
   #0  qemu_ram_unset_migratable (rb=0x0) at /home/legoater/work/qemu/qemu-xive-3.0.git/exec.c:1994
   #1  0x000000010072def0 in vmstate_unregister_ram (mr=0x101796af0, dev=<optimized out>)
   #2  0x0000000100694e5c in pci_del_option_rom (pdev=0x101796330)
   #3  pci_qdev_unrealize (dev=<optimized out>, errp=<optimized out>)
   #4  0x00000001005ff910 in device_set_realized (obj=0x101796330, value=<optimized out>, errp=0x0)
   #5  0x00000001007a487c in property_set_bool (obj=0x101796330, v=<optimized out>, name=<optimized out>, 
   #6  0x00000001007a7878 in object_property_set (obj=0x101796330, v=0x7fff70033110, 
   #7  0x00000001007aaf1c in object_property_set_qobject (obj=0x101796330, value=<optimized out>, 
   #8  0x00000001007a7b90 in object_property_set_bool (obj=0x101796330, value=<optimized out>, 
   #9  0x00000001005fcdd8 in device_unparent (obj=0x101796330)
   #10 0x00000001007a6dd0 in object_finalize_child_property (obj=<optimized out>, name=<optimized out>, 
   #11 0x00000001007a50c0 in object_property_del_child (obj=0x10111f800, child=0x101796330, 
   #12 0x0000000100425cc0 in spapr_phb_remove_pci_device_cb (dev=0x101796330)
   #13 0x0000000100427974 in spapr_drc_release (drc=0x1017e2df0)
   #14 0x0000000100429098 in spapr_drc_detach (drc=0x1017e2df0)
   #15 0x00000001004294e0 in drc_isolate_physical (drc=0x1017e2df0)
   #16 0x000000010042a50c in rtas_set_isolation_state (state=0, idx=<optimized out>)

 May be we should call pci_del_option_rom() before
 pci_unregister_io_regions() ?

 exec.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/exec.c b/exec.c
index ebadc0e302ad..0dabf7380438 100644
--- a/exec.c
+++ b/exec.c
@@ -1992,7 +1992,9 @@ void qemu_ram_set_migratable(RAMBlock *rb)
 
 void qemu_ram_unset_migratable(RAMBlock *rb)
 {
-    rb->flags &= ~RAM_MIGRATABLE;
+    if (rb) {
+        rb->flags &= ~RAM_MIGRATABLE;
+    }
 }
 
 /* Called with iothread lock held.  */
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [Qemu-devel] [PATCH] exec.c: check RAMBlock validity before changing its flag
  2018-07-01 17:19 [Qemu-devel] [PATCH] exec.c: check RAMBlock validity before changing its flag Cédric Le Goater
@ 2018-07-02  3:57 ` Peter Xu
  2018-07-03 12:45   ` Cédric Le Goater
  0 siblings, 1 reply; 10+ messages in thread
From: Peter Xu @ 2018-07-02  3:57 UTC (permalink / raw)
  To: Cédric Le Goater
  Cc: qemu-devel, Dr . David Alan Gilbert, Peter Maydell,
	Michael S. Tsirkin, Alex Williamson, Paolo Bonzini, David Gibson

On Sun, Jul 01, 2018 at 07:19:53PM +0200, Cédric Le Goater wrote:
> When a PCI device is unplugged, the PCI memory regions are deleted
> before the optional ROM RAMBlock is flagged non-migratable. But, when
> this is done, the RAMBlock has already been cleared from the region,
> leading to a segv.
> 
> Fix the issue by testing the RAMBlock before flagging it, as it is
> done in qemu_ram_unset_idstr()
> 
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> ---
> 
>  I caught this bug while deleting a passthrough device from a pseries
>  machine. Here is the stack:
>    
>    #0  qemu_ram_unset_migratable (rb=0x0) at /home/legoater/work/qemu/qemu-xive-3.0.git/exec.c:1994
>    #1  0x000000010072def0 in vmstate_unregister_ram (mr=0x101796af0, dev=<optimized out>)
>    #2  0x0000000100694e5c in pci_del_option_rom (pdev=0x101796330)
>    #3  pci_qdev_unrealize (dev=<optimized out>, errp=<optimized out>)
>    #4  0x00000001005ff910 in device_set_realized (obj=0x101796330, value=<optimized out>, errp=0x0)
>    #5  0x00000001007a487c in property_set_bool (obj=0x101796330, v=<optimized out>, name=<optimized out>, 
>    #6  0x00000001007a7878 in object_property_set (obj=0x101796330, v=0x7fff70033110, 
>    #7  0x00000001007aaf1c in object_property_set_qobject (obj=0x101796330, value=<optimized out>, 
>    #8  0x00000001007a7b90 in object_property_set_bool (obj=0x101796330, value=<optimized out>, 
>    #9  0x00000001005fcdd8 in device_unparent (obj=0x101796330)
>    #10 0x00000001007a6dd0 in object_finalize_child_property (obj=<optimized out>, name=<optimized out>, 
>    #11 0x00000001007a50c0 in object_property_del_child (obj=0x10111f800, child=0x101796330, 
>    #12 0x0000000100425cc0 in spapr_phb_remove_pci_device_cb (dev=0x101796330)
>    #13 0x0000000100427974 in spapr_drc_release (drc=0x1017e2df0)
>    #14 0x0000000100429098 in spapr_drc_detach (drc=0x1017e2df0)
>    #15 0x00000001004294e0 in drc_isolate_physical (drc=0x1017e2df0)
>    #16 0x000000010042a50c in rtas_set_isolation_state (state=0, idx=<optimized out>)
> 
>  May be we should call pci_del_option_rom() before
>  pci_unregister_io_regions() ?

This seems to make more sense to me.

Meanwhile I assume the name pci_del_option_rom() is a bit misleading -
it's not really deleting the ROM but unregistering the ROM only.
Instead IIUC it's pci_unregister_io_regions() which deleted that. So
maybe we can either rename the function pci_del_option_rom(), or we
can pick the ROM destruction out of pci_unregister_io_regions() and
put it into pci_del_option_rom() to make sure it's done as the last
step?

Regards,

> 
>  exec.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/exec.c b/exec.c
> index ebadc0e302ad..0dabf7380438 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -1992,7 +1992,9 @@ void qemu_ram_set_migratable(RAMBlock *rb)
>  
>  void qemu_ram_unset_migratable(RAMBlock *rb)
>  {
> -    rb->flags &= ~RAM_MIGRATABLE;
> +    if (rb) {
> +        rb->flags &= ~RAM_MIGRATABLE;
> +    }
>  }
>  
>  /* Called with iothread lock held.  */
> -- 
> 2.13.6
> 
> 

-- 
Peter Xu

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Qemu-devel] [PATCH] exec.c: check RAMBlock validity before changing its flag
  2018-07-02  3:57 ` Peter Xu
@ 2018-07-03 12:45   ` Cédric Le Goater
  2018-07-04  2:26     ` Peter Xu
  0 siblings, 1 reply; 10+ messages in thread
From: Cédric Le Goater @ 2018-07-03 12:45 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, Dr . David Alan Gilbert, Peter Maydell,
	Michael S. Tsirkin, Alex Williamson, Paolo Bonzini, David Gibson

On 07/02/2018 05:57 AM, Peter Xu wrote:
> On Sun, Jul 01, 2018 at 07:19:53PM +0200, Cédric Le Goater wrote:
>> When a PCI device is unplugged, the PCI memory regions are deleted
>> before the optional ROM RAMBlock is flagged non-migratable. But, when
>> this is done, the RAMBlock has already been cleared from the region,
>> leading to a segv.
>>
>> Fix the issue by testing the RAMBlock before flagging it, as it is
>> done in qemu_ram_unset_idstr()
>>
>> Signed-off-by: Cédric Le Goater <clg@kaod.org>
>> ---
>>
>>  I caught this bug while deleting a passthrough device from a pseries
>>  machine. Here is the stack:
>>    
>>    #0  qemu_ram_unset_migratable (rb=0x0) at /home/legoater/work/qemu/qemu-xive-3.0.git/exec.c:1994
>>    #1  0x000000010072def0 in vmstate_unregister_ram (mr=0x101796af0, dev=<optimized out>)
>>    #2  0x0000000100694e5c in pci_del_option_rom (pdev=0x101796330)
>>    #3  pci_qdev_unrealize (dev=<optimized out>, errp=<optimized out>)
>>    #4  0x00000001005ff910 in device_set_realized (obj=0x101796330, value=<optimized out>, errp=0x0)
>>    #5  0x00000001007a487c in property_set_bool (obj=0x101796330, v=<optimized out>, name=<optimized out>, 
>>    #6  0x00000001007a7878 in object_property_set (obj=0x101796330, v=0x7fff70033110, 
>>    #7  0x00000001007aaf1c in object_property_set_qobject (obj=0x101796330, value=<optimized out>, 
>>    #8  0x00000001007a7b90 in object_property_set_bool (obj=0x101796330, value=<optimized out>, 
>>    #9  0x00000001005fcdd8 in device_unparent (obj=0x101796330)
>>    #10 0x00000001007a6dd0 in object_finalize_child_property (obj=<optimized out>, name=<optimized out>, 
>>    #11 0x00000001007a50c0 in object_property_del_child (obj=0x10111f800, child=0x101796330, 
>>    #12 0x0000000100425cc0 in spapr_phb_remove_pci_device_cb (dev=0x101796330)
>>    #13 0x0000000100427974 in spapr_drc_release (drc=0x1017e2df0)
>>    #14 0x0000000100429098 in spapr_drc_detach (drc=0x1017e2df0)
>>    #15 0x00000001004294e0 in drc_isolate_physical (drc=0x1017e2df0)
>>    #16 0x000000010042a50c in rtas_set_isolation_state (state=0, idx=<optimized out>)
>>
>>  May be we should call pci_del_option_rom() before
>>  pci_unregister_io_regions() ?
> 
> This seems to make more sense to me.
> 
> Meanwhile I assume the name pci_del_option_rom() is a bit misleading -
> it's not really deleting the ROM but unregistering the ROM only.
> Instead IIUC it's pci_unregister_io_regions() which deleted that. So
> maybe we can either rename the function pci_del_option_rom(), or we
> can pick the ROM destruction out of pci_unregister_io_regions() and
> put it into pci_del_option_rom() to make sure it's done as the last
> step?

So it is a little more complex than I thought. 

The PCI device is a vfio PCI device and the PCI ROM region is initialized 
in vfio_pci_size_rom() with memory_region_init_io(), which does not 
allocate the RAMBlock, but has_rom is still set to true. 

When the device is deleted, pci_del_option_rom() is called and with it, vmstate_unregister_ram() because has_rom is set to true. Leading to the
SEGV.

I am not sure how to handle this case. It seems that the realize routine 
of VFIOPCIDevice is hijacking a little the PCIDevice layer.


C.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Qemu-devel] [PATCH] exec.c: check RAMBlock validity before changing its flag
  2018-07-03 12:45   ` Cédric Le Goater
@ 2018-07-04  2:26     ` Peter Xu
  2018-07-04  6:42       ` Cédric Le Goater
  0 siblings, 1 reply; 10+ messages in thread
From: Peter Xu @ 2018-07-04  2:26 UTC (permalink / raw)
  To: Cédric Le Goater
  Cc: qemu-devel, Dr . David Alan Gilbert, Peter Maydell,
	Michael S. Tsirkin, Alex Williamson, Paolo Bonzini, David Gibson

On Tue, Jul 03, 2018 at 02:45:24PM +0200, Cédric Le Goater wrote:
> On 07/02/2018 05:57 AM, Peter Xu wrote:
> > On Sun, Jul 01, 2018 at 07:19:53PM +0200, Cédric Le Goater wrote:
> >> When a PCI device is unplugged, the PCI memory regions are deleted
> >> before the optional ROM RAMBlock is flagged non-migratable. But, when
> >> this is done, the RAMBlock has already been cleared from the region,
> >> leading to a segv.
> >>
> >> Fix the issue by testing the RAMBlock before flagging it, as it is
> >> done in qemu_ram_unset_idstr()
> >>
> >> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> >> ---
> >>
> >>  I caught this bug while deleting a passthrough device from a pseries
> >>  machine. Here is the stack:
> >>    
> >>    #0  qemu_ram_unset_migratable (rb=0x0) at /home/legoater/work/qemu/qemu-xive-3.0.git/exec.c:1994
> >>    #1  0x000000010072def0 in vmstate_unregister_ram (mr=0x101796af0, dev=<optimized out>)
> >>    #2  0x0000000100694e5c in pci_del_option_rom (pdev=0x101796330)
> >>    #3  pci_qdev_unrealize (dev=<optimized out>, errp=<optimized out>)
> >>    #4  0x00000001005ff910 in device_set_realized (obj=0x101796330, value=<optimized out>, errp=0x0)
> >>    #5  0x00000001007a487c in property_set_bool (obj=0x101796330, v=<optimized out>, name=<optimized out>, 
> >>    #6  0x00000001007a7878 in object_property_set (obj=0x101796330, v=0x7fff70033110, 
> >>    #7  0x00000001007aaf1c in object_property_set_qobject (obj=0x101796330, value=<optimized out>, 
> >>    #8  0x00000001007a7b90 in object_property_set_bool (obj=0x101796330, value=<optimized out>, 
> >>    #9  0x00000001005fcdd8 in device_unparent (obj=0x101796330)
> >>    #10 0x00000001007a6dd0 in object_finalize_child_property (obj=<optimized out>, name=<optimized out>, 
> >>    #11 0x00000001007a50c0 in object_property_del_child (obj=0x10111f800, child=0x101796330, 
> >>    #12 0x0000000100425cc0 in spapr_phb_remove_pci_device_cb (dev=0x101796330)
> >>    #13 0x0000000100427974 in spapr_drc_release (drc=0x1017e2df0)
> >>    #14 0x0000000100429098 in spapr_drc_detach (drc=0x1017e2df0)
> >>    #15 0x00000001004294e0 in drc_isolate_physical (drc=0x1017e2df0)
> >>    #16 0x000000010042a50c in rtas_set_isolation_state (state=0, idx=<optimized out>)
> >>
> >>  May be we should call pci_del_option_rom() before
> >>  pci_unregister_io_regions() ?
> > 
> > This seems to make more sense to me.
> > 
> > Meanwhile I assume the name pci_del_option_rom() is a bit misleading -
> > it's not really deleting the ROM but unregistering the ROM only.
> > Instead IIUC it's pci_unregister_io_regions() which deleted that. So
> > maybe we can either rename the function pci_del_option_rom(), or we
> > can pick the ROM destruction out of pci_unregister_io_regions() and
> > put it into pci_del_option_rom() to make sure it's done as the last
> > step?
> 
> So it is a little more complex than I thought. 
> 
> The PCI device is a vfio PCI device and the PCI ROM region is initialized 
> in vfio_pci_size_rom() with memory_region_init_io(), which does not 
> allocate the RAMBlock, but has_rom is still set to true. 
> 
> When the device is deleted, pci_del_option_rom() is called and with it, vmstate_unregister_ram() because has_rom is set to true. Leading to the
> SEGV.
> 
> I am not sure how to handle this case. It seems that the realize routine 
> of VFIOPCIDevice is hijacking a little the PCIDevice layer.

Indeed.

Then now I'm a bit confused on who actually deleted the ROM memory
region that was created when pci_add_option_rom() was called.  It
seems to be leaked.

AFAIU the rest of the memory regions of the BARs (0-5) are managed by
specific device emulation code, however this ROM memory region is
managed by PCI subsystem.  Not sure whether that means we should
destroy the region in PCI subsystem too, e.g. in pci_del_option_rom().

And now I see this patch might be a valid fix for the VFIO-specific
issue (though we might comment that a bit somewhere).

Thanks,

-- 
Peter Xu

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Qemu-devel] [PATCH] exec.c: check RAMBlock validity before changing its flag
  2018-07-04  2:26     ` Peter Xu
@ 2018-07-04  6:42       ` Cédric Le Goater
  2018-07-04  9:34         ` Paolo Bonzini
  0 siblings, 1 reply; 10+ messages in thread
From: Cédric Le Goater @ 2018-07-04  6:42 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, Dr . David Alan Gilbert, Peter Maydell,
	Michael S. Tsirkin, Alex Williamson, Paolo Bonzini, David Gibson

On 07/04/2018 04:26 AM, Peter Xu wrote:
> On Tue, Jul 03, 2018 at 02:45:24PM +0200, Cédric Le Goater wrote:
>> On 07/02/2018 05:57 AM, Peter Xu wrote:
>>> On Sun, Jul 01, 2018 at 07:19:53PM +0200, Cédric Le Goater wrote:
>>>> When a PCI device is unplugged, the PCI memory regions are deleted
>>>> before the optional ROM RAMBlock is flagged non-migratable. But, when
>>>> this is done, the RAMBlock has already been cleared from the region,
>>>> leading to a segv.
>>>>
>>>> Fix the issue by testing the RAMBlock before flagging it, as it is
>>>> done in qemu_ram_unset_idstr()
>>>>
>>>> Signed-off-by: Cédric Le Goater <clg@kaod.org>
>>>> ---
>>>>
>>>>  I caught this bug while deleting a passthrough device from a pseries
>>>>  machine. Here is the stack:
>>>>    
>>>>    #0  qemu_ram_unset_migratable (rb=0x0) at /home/legoater/work/qemu/qemu-xive-3.0.git/exec.c:1994
>>>>    #1  0x000000010072def0 in vmstate_unregister_ram (mr=0x101796af0, dev=<optimized out>)
>>>>    #2  0x0000000100694e5c in pci_del_option_rom (pdev=0x101796330)
>>>>    #3  pci_qdev_unrealize (dev=<optimized out>, errp=<optimized out>)
>>>>    #4  0x00000001005ff910 in device_set_realized (obj=0x101796330, value=<optimized out>, errp=0x0)
>>>>    #5  0x00000001007a487c in property_set_bool (obj=0x101796330, v=<optimized out>, name=<optimized out>, 
>>>>    #6  0x00000001007a7878 in object_property_set (obj=0x101796330, v=0x7fff70033110, 
>>>>    #7  0x00000001007aaf1c in object_property_set_qobject (obj=0x101796330, value=<optimized out>, 
>>>>    #8  0x00000001007a7b90 in object_property_set_bool (obj=0x101796330, value=<optimized out>, 
>>>>    #9  0x00000001005fcdd8 in device_unparent (obj=0x101796330)
>>>>    #10 0x00000001007a6dd0 in object_finalize_child_property (obj=<optimized out>, name=<optimized out>, 
>>>>    #11 0x00000001007a50c0 in object_property_del_child (obj=0x10111f800, child=0x101796330, 
>>>>    #12 0x0000000100425cc0 in spapr_phb_remove_pci_device_cb (dev=0x101796330)
>>>>    #13 0x0000000100427974 in spapr_drc_release (drc=0x1017e2df0)
>>>>    #14 0x0000000100429098 in spapr_drc_detach (drc=0x1017e2df0)
>>>>    #15 0x00000001004294e0 in drc_isolate_physical (drc=0x1017e2df0)
>>>>    #16 0x000000010042a50c in rtas_set_isolation_state (state=0, idx=<optimized out>)
>>>>
>>>>  May be we should call pci_del_option_rom() before
>>>>  pci_unregister_io_regions() ?
>>>
>>> This seems to make more sense to me.
>>>
>>> Meanwhile I assume the name pci_del_option_rom() is a bit misleading -
>>> it's not really deleting the ROM but unregistering the ROM only.
>>> Instead IIUC it's pci_unregister_io_regions() which deleted that. So
>>> maybe we can either rename the function pci_del_option_rom(), or we
>>> can pick the ROM destruction out of pci_unregister_io_regions() and
>>> put it into pci_del_option_rom() to make sure it's done as the last
>>> step?
>>
>> So it is a little more complex than I thought. 
>>
>> The PCI device is a vfio PCI device and the PCI ROM region is initialized 
>> in vfio_pci_size_rom() with memory_region_init_io(), which does not 
>> allocate the RAMBlock, but has_rom is still set to true. 
>>
>> When the device is deleted, pci_del_option_rom() is called and with it, vmstate_unregister_ram() because has_rom is set to true. Leading to the
>> SEGV.
>>
>> I am not sure how to handle this case. It seems that the realize routine 
>> of VFIOPCIDevice is hijacking a little the PCIDevice layer.
> 
> Indeed.
> 
> Then now I'm a bit confused on who actually deleted the ROM memory
> region that was created when pci_add_option_rom() was called.  It
> seems to be leaked.
> 
> AFAIU the rest of the memory regions of the BARs (0-5) are managed by
> specific device emulation code, however this ROM memory region is
> managed by PCI subsystem.  Not sure whether that means we should
> destroy the region in PCI subsystem too, e.g. in pci_del_option_rom().
> 
> And now I see this patch might be a valid fix for the VFIO-specific
> issue (though we might comment that a bit somewhere).

yes. I will send a v2 with an updated commit log.

Thanks,

C.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Qemu-devel] [PATCH] exec.c: check RAMBlock validity before changing its flag
  2018-07-04  6:42       ` Cédric Le Goater
@ 2018-07-04  9:34         ` Paolo Bonzini
  2018-07-04  9:55           ` Peter Xu
  0 siblings, 1 reply; 10+ messages in thread
From: Paolo Bonzini @ 2018-07-04  9:34 UTC (permalink / raw)
  To: Cédric Le Goater, Peter Xu
  Cc: qemu-devel, Dr . David Alan Gilbert, Peter Maydell,
	Michael S. Tsirkin, Alex Williamson, David Gibson

On 04/07/2018 08:42, Cédric Le Goater wrote:
> On 07/04/2018 04:26 AM, Peter Xu wrote:
>> On Tue, Jul 03, 2018 at 02:45:24PM +0200, Cédric Le Goater wrote:
>>> On 07/02/2018 05:57 AM, Peter Xu wrote:
>>>> On Sun, Jul 01, 2018 at 07:19:53PM +0200, Cédric Le Goater wrote:
>>>>> When a PCI device is unplugged, the PCI memory regions are deleted
>>>>> before the optional ROM RAMBlock is flagged non-migratable. But, when
>>>>> this is done, the RAMBlock has already been cleared from the region,
>>>>> leading to a segv.
>>>>>
>>>>> Fix the issue by testing the RAMBlock before flagging it, as it is
>>>>> done in qemu_ram_unset_idstr()
>>>>>
>>>>> Signed-off-by: Cédric Le Goater <clg@kaod.org>
>>>>> ---
>>>>>
>>>>>  I caught this bug while deleting a passthrough device from a pseries
>>>>>  machine. Here is the stack:
>>>>>    
>>>>>    #0  qemu_ram_unset_migratable (rb=0x0) at /home/legoater/work/qemu/qemu-xive-3.0.git/exec.c:1994
>>>>>    #1  0x000000010072def0 in vmstate_unregister_ram (mr=0x101796af0, dev=<optimized out>)
>>>>>    #2  0x0000000100694e5c in pci_del_option_rom (pdev=0x101796330)
>>>>>    #3  pci_qdev_unrealize (dev=<optimized out>, errp=<optimized out>)
>>>>>    #4  0x00000001005ff910 in device_set_realized (obj=0x101796330, value=<optimized out>, errp=0x0)
>>>>>    #5  0x00000001007a487c in property_set_bool (obj=0x101796330, v=<optimized out>, name=<optimized out>, 
>>>>>    #6  0x00000001007a7878 in object_property_set (obj=0x101796330, v=0x7fff70033110, 
>>>>>    #7  0x00000001007aaf1c in object_property_set_qobject (obj=0x101796330, value=<optimized out>, 
>>>>>    #8  0x00000001007a7b90 in object_property_set_bool (obj=0x101796330, value=<optimized out>, 
>>>>>    #9  0x00000001005fcdd8 in device_unparent (obj=0x101796330)
>>>>>    #10 0x00000001007a6dd0 in object_finalize_child_property (obj=<optimized out>, name=<optimized out>, 
>>>>>    #11 0x00000001007a50c0 in object_property_del_child (obj=0x10111f800, child=0x101796330, 
>>>>>    #12 0x0000000100425cc0 in spapr_phb_remove_pci_device_cb (dev=0x101796330)
>>>>>    #13 0x0000000100427974 in spapr_drc_release (drc=0x1017e2df0)
>>>>>    #14 0x0000000100429098 in spapr_drc_detach (drc=0x1017e2df0)
>>>>>    #15 0x00000001004294e0 in drc_isolate_physical (drc=0x1017e2df0)
>>>>>    #16 0x000000010042a50c in rtas_set_isolation_state (state=0, idx=<optimized out>)
>>>>>
>>>>>  May be we should call pci_del_option_rom() before
>>>>>  pci_unregister_io_regions() ?
>>>>
>>>> This seems to make more sense to me.
>>>>
>>>> Meanwhile I assume the name pci_del_option_rom() is a bit misleading -
>>>> it's not really deleting the ROM but unregistering the ROM only.
>>>> Instead IIUC it's pci_unregister_io_regions() which deleted that. So
>>>> maybe we can either rename the function pci_del_option_rom(), or we
>>>> can pick the ROM destruction out of pci_unregister_io_regions() and
>>>> put it into pci_del_option_rom() to make sure it's done as the last
>>>> step?
>>>
>>> So it is a little more complex than I thought. 
>>>
>>> The PCI device is a vfio PCI device and the PCI ROM region is initialized 
>>> in vfio_pci_size_rom() with memory_region_init_io(), which does not 
>>> allocate the RAMBlock, but has_rom is still set to true. 
>>>
>>> When the device is deleted, pci_del_option_rom() is called and with it, vmstate_unregister_ram() because has_rom is set to true. Leading to the
>>> SEGV.
>>>
>>> I am not sure how to handle this case. It seems that the realize routine 
>>> of VFIOPCIDevice is hijacking a little the PCIDevice layer.
>>
>> Indeed.
>>
>> Then now I'm a bit confused on who actually deleted the ROM memory
>> region that was created when pci_add_option_rom() was called.  It
>> seems to be leaked.
>>
>> AFAIU the rest of the memory regions of the BARs (0-5) are managed by
>> specific device emulation code, however this ROM memory region is
>> managed by PCI subsystem.  Not sure whether that means we should
>> destroy the region in PCI subsystem too, e.g. in pci_del_option_rom().
>>
>> And now I see this patch might be a valid fix for the VFIO-specific
>> issue (though we might comment that a bit somewhere).
> 
> yes. I will send a v2 with an updated commit log.

I wonder if the fix is simply to... get rid of vmstate_unregister_ram.

It was added in

    commit b0e56e0b63f350691b52d3e75e89bb64143fbeff
    Author: Hu Tao <hutao@cn.fujitsu.com>
    Date:   Wed Apr 2 15:13:27 2014 +0800

    unset RAMBlock idstr when unregister MemoryRegion

    Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

whose commit message is a bit lacking, but
http://lists.gnu.org/archive/html/qemu-devel/2014-04/msg00282.html helps
more.  It seems like the original bug was a reference count issue.

Clearing the new migratable flag should also be unnecessary.

Paolo

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Qemu-devel] [PATCH] exec.c: check RAMBlock validity before changing its flag
  2018-07-04  9:34         ` Paolo Bonzini
@ 2018-07-04  9:55           ` Peter Xu
  2018-07-04 12:16             ` Paolo Bonzini
  0 siblings, 1 reply; 10+ messages in thread
From: Peter Xu @ 2018-07-04  9:55 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Cédric Le Goater, qemu-devel, Dr . David Alan Gilbert,
	Peter Maydell, Michael S. Tsirkin, Alex Williamson, David Gibson

On Wed, Jul 04, 2018 at 11:34:55AM +0200, Paolo Bonzini wrote:
> On 04/07/2018 08:42, Cédric Le Goater wrote:
> > On 07/04/2018 04:26 AM, Peter Xu wrote:
> >> On Tue, Jul 03, 2018 at 02:45:24PM +0200, Cédric Le Goater wrote:
> >>> On 07/02/2018 05:57 AM, Peter Xu wrote:
> >>>> On Sun, Jul 01, 2018 at 07:19:53PM +0200, Cédric Le Goater wrote:
> >>>>> When a PCI device is unplugged, the PCI memory regions are deleted
> >>>>> before the optional ROM RAMBlock is flagged non-migratable. But, when
> >>>>> this is done, the RAMBlock has already been cleared from the region,
> >>>>> leading to a segv.
> >>>>>
> >>>>> Fix the issue by testing the RAMBlock before flagging it, as it is
> >>>>> done in qemu_ram_unset_idstr()
> >>>>>
> >>>>> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> >>>>> ---
> >>>>>
> >>>>>  I caught this bug while deleting a passthrough device from a pseries
> >>>>>  machine. Here is the stack:
> >>>>>    
> >>>>>    #0  qemu_ram_unset_migratable (rb=0x0) at /home/legoater/work/qemu/qemu-xive-3.0.git/exec.c:1994
> >>>>>    #1  0x000000010072def0 in vmstate_unregister_ram (mr=0x101796af0, dev=<optimized out>)
> >>>>>    #2  0x0000000100694e5c in pci_del_option_rom (pdev=0x101796330)
> >>>>>    #3  pci_qdev_unrealize (dev=<optimized out>, errp=<optimized out>)
> >>>>>    #4  0x00000001005ff910 in device_set_realized (obj=0x101796330, value=<optimized out>, errp=0x0)
> >>>>>    #5  0x00000001007a487c in property_set_bool (obj=0x101796330, v=<optimized out>, name=<optimized out>, 
> >>>>>    #6  0x00000001007a7878 in object_property_set (obj=0x101796330, v=0x7fff70033110, 
> >>>>>    #7  0x00000001007aaf1c in object_property_set_qobject (obj=0x101796330, value=<optimized out>, 
> >>>>>    #8  0x00000001007a7b90 in object_property_set_bool (obj=0x101796330, value=<optimized out>, 
> >>>>>    #9  0x00000001005fcdd8 in device_unparent (obj=0x101796330)
> >>>>>    #10 0x00000001007a6dd0 in object_finalize_child_property (obj=<optimized out>, name=<optimized out>, 
> >>>>>    #11 0x00000001007a50c0 in object_property_del_child (obj=0x10111f800, child=0x101796330, 
> >>>>>    #12 0x0000000100425cc0 in spapr_phb_remove_pci_device_cb (dev=0x101796330)
> >>>>>    #13 0x0000000100427974 in spapr_drc_release (drc=0x1017e2df0)
> >>>>>    #14 0x0000000100429098 in spapr_drc_detach (drc=0x1017e2df0)
> >>>>>    #15 0x00000001004294e0 in drc_isolate_physical (drc=0x1017e2df0)
> >>>>>    #16 0x000000010042a50c in rtas_set_isolation_state (state=0, idx=<optimized out>)
> >>>>>
> >>>>>  May be we should call pci_del_option_rom() before
> >>>>>  pci_unregister_io_regions() ?
> >>>>
> >>>> This seems to make more sense to me.
> >>>>
> >>>> Meanwhile I assume the name pci_del_option_rom() is a bit misleading -
> >>>> it's not really deleting the ROM but unregistering the ROM only.
> >>>> Instead IIUC it's pci_unregister_io_regions() which deleted that. So
> >>>> maybe we can either rename the function pci_del_option_rom(), or we
> >>>> can pick the ROM destruction out of pci_unregister_io_regions() and
> >>>> put it into pci_del_option_rom() to make sure it's done as the last
> >>>> step?
> >>>
> >>> So it is a little more complex than I thought. 
> >>>
> >>> The PCI device is a vfio PCI device and the PCI ROM region is initialized 
> >>> in vfio_pci_size_rom() with memory_region_init_io(), which does not 
> >>> allocate the RAMBlock, but has_rom is still set to true. 
> >>>
> >>> When the device is deleted, pci_del_option_rom() is called and with it, vmstate_unregister_ram() because has_rom is set to true. Leading to the
> >>> SEGV.
> >>>
> >>> I am not sure how to handle this case. It seems that the realize routine 
> >>> of VFIOPCIDevice is hijacking a little the PCIDevice layer.
> >>
> >> Indeed.
> >>
> >> Then now I'm a bit confused on who actually deleted the ROM memory
> >> region that was created when pci_add_option_rom() was called.  It
> >> seems to be leaked.
> >>
> >> AFAIU the rest of the memory regions of the BARs (0-5) are managed by
> >> specific device emulation code, however this ROM memory region is
> >> managed by PCI subsystem.  Not sure whether that means we should
> >> destroy the region in PCI subsystem too, e.g. in pci_del_option_rom().
> >>
> >> And now I see this patch might be a valid fix for the VFIO-specific
> >> issue (though we might comment that a bit somewhere).
> > 
> > yes. I will send a v2 with an updated commit log.
> 
> I wonder if the fix is simply to... get rid of vmstate_unregister_ram.
> 
> It was added in
> 
>     commit b0e56e0b63f350691b52d3e75e89bb64143fbeff
>     Author: Hu Tao <hutao@cn.fujitsu.com>
>     Date:   Wed Apr 2 15:13:27 2014 +0800
> 
>     unset RAMBlock idstr when unregister MemoryRegion
> 
>     Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
>     Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> 
> whose commit message is a bit lacking, but
> http://lists.gnu.org/archive/html/qemu-devel/2014-04/msg00282.html helps
> more.  It seems like the original bug was a reference count issue.
> 
> Clearing the new migratable flag should also be unnecessary.

But even if we get rid of vmstate_unregister_ram(), the leak could
still be there?

I'm not sure what was leaked when b0e56e0b6 was introduced, I feel
like it's the RAMBlock of the memdev.  Here I think the ROM memory
region seems to be leaked as well (along with the RAMBlock inside)?

Regards,

-- 
Peter Xu

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Qemu-devel] [PATCH] exec.c: check RAMBlock validity before changing its flag
  2018-07-04  9:55           ` Peter Xu
@ 2018-07-04 12:16             ` Paolo Bonzini
  2018-07-05  5:56               ` Cédric Le Goater
  0 siblings, 1 reply; 10+ messages in thread
From: Paolo Bonzini @ 2018-07-04 12:16 UTC (permalink / raw)
  To: Peter Xu
  Cc: Cédric Le Goater, qemu-devel, Dr . David Alan Gilbert,
	Peter Maydell, Michael S. Tsirkin, Alex Williamson, David Gibson

On 04/07/2018 11:55, Peter Xu wrote:
>>     commit b0e56e0b63f350691b52d3e75e89bb64143fbeff
>>     Author: Hu Tao <hutao@cn.fujitsu.com>
>>     Date:   Wed Apr 2 15:13:27 2014 +0800
>>
>>     unset RAMBlock idstr when unregister MemoryRegion
>>
>>     Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
>>     Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>>
>> whose commit message is a bit lacking, but
>> http://lists.gnu.org/archive/html/qemu-devel/2014-04/msg00282.html helps
>> more.  It seems like the original bug was a reference count issue.
>>
>> Clearing the new migratable flag should also be unnecessary.
> But even if we get rid of vmstate_unregister_ram(), the leak could
> still be there?
> 
> I'm not sure what was leaked when b0e56e0b6 was introduced, I feel
> like it's the RAMBlock of the memdev.  Here I think the ROM memory
> region seems to be leaked as well (along with the RAMBlock inside)?

The leak would be another bug that vmstate_unregister_ram is just
papering over.  We need to test memory unplug with
vmstate_unregister_ram removed, and fix bugs if any.

Paolo

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Qemu-devel] [PATCH] exec.c: check RAMBlock validity before changing its flag
  2018-07-04 12:16             ` Paolo Bonzini
@ 2018-07-05  5:56               ` Cédric Le Goater
  2018-07-05 12:01                 ` Paolo Bonzini
  0 siblings, 1 reply; 10+ messages in thread
From: Cédric Le Goater @ 2018-07-05  5:56 UTC (permalink / raw)
  To: Paolo Bonzini, Peter Xu
  Cc: qemu-devel, Dr . David Alan Gilbert, Peter Maydell,
	Michael S. Tsirkin, Alex Williamson, David Gibson

Hello Paolo,

On 07/04/2018 02:16 PM, Paolo Bonzini wrote:
> On 04/07/2018 11:55, Peter Xu wrote:
>>>     commit b0e56e0b63f350691b52d3e75e89bb64143fbeff
>>>     Author: Hu Tao <hutao@cn.fujitsu.com>
>>>     Date:   Wed Apr 2 15:13:27 2014 +0800
>>>
>>>     unset RAMBlock idstr when unregister MemoryRegion
>>>
>>>     Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
>>>     Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>>>
>>> whose commit message is a bit lacking, but
>>> http://lists.gnu.org/archive/html/qemu-devel/2014-04/msg00282.html helps
>>> more.  It seems like the original bug was a reference count issue.
>>>
>>> Clearing the new migratable flag should also be unnecessary.
>> But even if we get rid of vmstate_unregister_ram(), the leak could
>> still be there?
>>
>> I'm not sure what was leaked when b0e56e0b6 was introduced, I feel
>> like it's the RAMBlock of the memdev.  Here I think the ROM memory
>> region seems to be leaked as well (along with the RAMBlock inside)?
> 
> The leak would be another bug that vmstate_unregister_ram is just
> papering over.  We need to test memory unplug with
> vmstate_unregister_ram removed, and fix bugs if any.

So for the time being, you would just get rid of pci_del_option_rom()
which only does vmstate_unregister_ram() ? 

Thanks,

C.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Qemu-devel] [PATCH] exec.c: check RAMBlock validity before changing its flag
  2018-07-05  5:56               ` Cédric Le Goater
@ 2018-07-05 12:01                 ` Paolo Bonzini
  0 siblings, 0 replies; 10+ messages in thread
From: Paolo Bonzini @ 2018-07-05 12:01 UTC (permalink / raw)
  To: Cédric Le Goater, Peter Xu
  Cc: qemu-devel, Dr . David Alan Gilbert, Peter Maydell,
	Michael S. Tsirkin, Alex Williamson, David Gibson

On 05/07/2018 07:56, Cédric Le Goater wrote:
> Hello Paolo,
> 
> On 07/04/2018 02:16 PM, Paolo Bonzini wrote:
>> On 04/07/2018 11:55, Peter Xu wrote:
>>>>     commit b0e56e0b63f350691b52d3e75e89bb64143fbeff
>>>>     Author: Hu Tao <hutao@cn.fujitsu.com>
>>>>     Date:   Wed Apr 2 15:13:27 2014 +0800
>>>>
>>>>     unset RAMBlock idstr when unregister MemoryRegion
>>>>
>>>>     Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
>>>>     Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>>>>
>>>> whose commit message is a bit lacking, but
>>>> http://lists.gnu.org/archive/html/qemu-devel/2014-04/msg00282.html helps
>>>> more.  It seems like the original bug was a reference count issue.
>>>>
>>>> Clearing the new migratable flag should also be unnecessary.
>>> But even if we get rid of vmstate_unregister_ram(), the leak could
>>> still be there?
>>>
>>> I'm not sure what was leaked when b0e56e0b6 was introduced, I feel
>>> like it's the RAMBlock of the memdev.  Here I think the ROM memory
>>> region seems to be leaked as well (along with the RAMBlock inside)?
>>
>> The leak would be another bug that vmstate_unregister_ram is just
>> papering over.  We need to test memory unplug with
>> vmstate_unregister_ram removed, and fix bugs if any.
> 
> So for the time being, you would just get rid of pci_del_option_rom()
> which only does vmstate_unregister_ram() ? 

Yes, I think so.

Paolo

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2018-07-05 12:01 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-07-01 17:19 [Qemu-devel] [PATCH] exec.c: check RAMBlock validity before changing its flag Cédric Le Goater
2018-07-02  3:57 ` Peter Xu
2018-07-03 12:45   ` Cédric Le Goater
2018-07-04  2:26     ` Peter Xu
2018-07-04  6:42       ` Cédric Le Goater
2018-07-04  9:34         ` Paolo Bonzini
2018-07-04  9:55           ` Peter Xu
2018-07-04 12:16             ` Paolo Bonzini
2018-07-05  5:56               ` Cédric Le Goater
2018-07-05 12:01                 ` Paolo Bonzini

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.