All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH] Register usb-uhci reset function.
@ 2009-06-16 12:47 Gleb Natapov
  2009-06-16 17:14 ` Paul Brook
                   ` (2 more replies)
  0 siblings, 3 replies; 21+ messages in thread
From: Gleb Natapov @ 2009-06-16 12:47 UTC (permalink / raw)
  To: qemu-devel

Update irq line on reset. Reseting irq line is required because
racing irq from pci device will call piix3_set_irq(). piix3_set_irq()
will remember current level in pci_irq_levels[]. The PIC line will be
triggered if one of pci_irq_levels[] is set (depends on piix3 config).
If for instance pci_irq_levels[0] and pci_irq_levels[1] are mapped to
the same PIC irq and during reset pci_irq_levels[1] == 1, but device
that drives pci_irq_levels[0] is initialized first the device driver
will not be able to lower irq line.

Without this patch RHEL4.8 hangs after reboot because usb interrupt
does not go away.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
diff --git a/hw/usb-uhci.c b/hw/usb-uhci.c
index 689d40a..2db9850 100644
--- a/hw/usb-uhci.c
+++ b/hw/usb-uhci.c
@@ -346,6 +346,7 @@ static void uhci_reset(UHCIState *s)
     }
 
     uhci_async_cancel_all(s);
+    uhci_update_irq(s);
 }
 
 static void uhci_save(QEMUFile *f, void *opaque)
@@ -1093,6 +1094,7 @@ void usb_uhci_piix3_init(PCIBus *bus, int devfn)
     }
     s->frame_timer = qemu_new_timer(vm_clock, uhci_frame_timer, s);
 
+    qemu_register_reset(uhci_reset, 0, s);
     uhci_reset(s);
 
     /* Use region 4 for consistency with real hardware.  BSD guests seem
@@ -1127,6 +1129,7 @@ void usb_uhci_piix4_init(PCIBus *bus, int devfn)
     }
     s->frame_timer = qemu_new_timer(vm_clock, uhci_frame_timer, s);
 
+    qemu_register_reset(uhci_reset, 0, s);
     uhci_reset(s);
 
     /* Use region 4 for consistency with real hardware.  BSD guests seem
--
			Gleb.

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH] Register usb-uhci reset function.
  2009-06-16 12:47 [Qemu-devel] [PATCH] Register usb-uhci reset function Gleb Natapov
@ 2009-06-16 17:14 ` Paul Brook
  2009-06-16 17:37   ` Gleb Natapov
  2009-06-16 18:02   ` Blue Swirl
  2009-06-16 19:19 ` Anthony Liguori
  2009-06-17  9:07 ` Filip Navara
  2 siblings, 2 replies; 21+ messages in thread
From: Paul Brook @ 2009-06-16 17:14 UTC (permalink / raw)
  To: qemu-devel; +Cc: Gleb Natapov

On Tuesday 16 June 2009, Gleb Natapov wrote:
> Update irq line on reset. Reseting irq line is required because
> racing irq from pci device will call piix3_set_irq(). piix3_set_irq()
> will remember current level in pci_irq_levels[]. The PIC line will be
> triggered if one of pci_irq_levels[] is set (depends on piix3 config).
> If for instance pci_irq_levels[0] and pci_irq_levels[1] are mapped to
> the same PIC irq and during reset pci_irq_levels[1] == 1, but device
> that drives pci_irq_levels[0] is initialized first the device driver
> will not be able to lower irq line.

This is nonsense.

The only relevant circumstances are if the devices raises an IRQ, and is then 
reset by software while the system is running. It's got nothing to do with 
piix3, PCI bus interrupt sharing or system reset. If you are seeing problems 
after a system reset then your bug lies elsewhere.

Paul

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH] Register usb-uhci reset function.
  2009-06-16 17:14 ` Paul Brook
@ 2009-06-16 17:37   ` Gleb Natapov
  2009-06-16 18:41     ` Paul Brook
  2009-06-16 18:02   ` Blue Swirl
  1 sibling, 1 reply; 21+ messages in thread
From: Gleb Natapov @ 2009-06-16 17:37 UTC (permalink / raw)
  To: Paul Brook; +Cc: qemu-devel

On Tue, Jun 16, 2009 at 06:14:57PM +0100, Paul Brook wrote:
> On Tuesday 16 June 2009, Gleb Natapov wrote:
> > Update irq line on reset. Reseting irq line is required because
> > racing irq from pci device will call piix3_set_irq(). piix3_set_irq()
> > will remember current level in pci_irq_levels[]. The PIC line will be
> > triggered if one of pci_irq_levels[] is set (depends on piix3 config).
> > If for instance pci_irq_levels[0] and pci_irq_levels[1] are mapped to
> > the same PIC irq and during reset pci_irq_levels[1] == 1, but device
> > that drives pci_irq_levels[0] is initialized first the device driver
> > will not be able to lower irq line.
> 
> This is nonsense.
> 
The writeup below is nonsense. I described you circumstances. Show me
the code where it works different from what I described.

> The only relevant circumstances are if the devices raises an IRQ, and is then 
> reset by software while the system is running. It's got nothing to do with 
> piix3, PCI bus interrupt sharing or system reset. If you are seeing problems 
> after a system reset then your bug lies elsewhere.
> 
So you apparently never heard about hardware reset? You never saw buggy
guests that does not reset HW before reboot?

--
			Gleb.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH] Register usb-uhci reset function.
  2009-06-16 17:14 ` Paul Brook
  2009-06-16 17:37   ` Gleb Natapov
@ 2009-06-16 18:02   ` Blue Swirl
  1 sibling, 0 replies; 21+ messages in thread
From: Blue Swirl @ 2009-06-16 18:02 UTC (permalink / raw)
  To: Paul Brook; +Cc: qemu-devel, Gleb Natapov

On 6/16/09, Paul Brook <paul@codesourcery.com> wrote:
> On Tuesday 16 June 2009, Gleb Natapov wrote:
>  > Update irq line on reset. Reseting irq line is required because
>  > racing irq from pci device will call piix3_set_irq(). piix3_set_irq()
>  > will remember current level in pci_irq_levels[]. The PIC line will be
>  > triggered if one of pci_irq_levels[] is set (depends on piix3 config).
>  > If for instance pci_irq_levels[0] and pci_irq_levels[1] are mapped to
>  > the same PIC irq and during reset pci_irq_levels[1] == 1, but device
>  > that drives pci_irq_levels[0] is initialized first the device driver
>  > will not be able to lower irq line.
>
>
> This is nonsense.
>
>  The only relevant circumstances are if the devices raises an IRQ, and is then
>  reset by software while the system is running. It's got nothing to do with
>  piix3, PCI bus interrupt sharing or system reset. If you are seeing problems
>  after a system reset then your bug lies elsewhere.

No, the scenario Gleb describes is true, piix_pci.c has internal state
for the OR of the IRQs coming in and the state is not reset because
there is no reset handler. This is buggy.

The real fix is to implement piix3 reset correctly, not like Gleb
proposes now but 3/3 of his earlier patches.

As a less attractive alternative, piix3 could store the IRQ state for
each incoming IRQ, not OR state. Then all the devices would need to
call lower_irq to reach correct reset state at piix3.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH] Register usb-uhci reset function.
  2009-06-16 17:37   ` Gleb Natapov
@ 2009-06-16 18:41     ` Paul Brook
  2009-06-16 19:11       ` Gleb Natapov
  0 siblings, 1 reply; 21+ messages in thread
From: Paul Brook @ 2009-06-16 18:41 UTC (permalink / raw)
  To: qemu-devel; +Cc: Gleb Natapov

>> If allow devices to be reset independently then they should probably set
>> theit IRQ output on reset.
>
>It is not "if" it is "when". We have to allow device to be reset
> independently for hot-unplug.

I'm not entirely convinced about reset-on-hotunplug. What about a device that 
pulls its IRQ line high on reset?

> > The only relevant circumstances are if the devices raises an IRQ, and is
> > then reset by software while the system is running. It's got nothing to
> > do with piix3, PCI bus interrupt sharing or system reset. If you are
> > seeing problems after a system reset then your bug lies elsewhere.
>
> So you apparently never heard about hardware reset? You never saw buggy
> guests that does not reset HW before reboot?

Clearing device state on full system reset makes sense because the whole point 
is that we're returning the system to its power-on state.
Lowering the IRQ when a single device is reset also makes sense.

However having the device explicit set its IRQ line during a full system reset 
is a different matter. This is probably harmless most of the time, and may 
paper over other bugs (e.g. the PCI bus not being reset properly). However I 
do not believe it is the correct justification for these changes.

Paul

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH] Register usb-uhci reset function.
  2009-06-16 18:41     ` Paul Brook
@ 2009-06-16 19:11       ` Gleb Natapov
  2009-06-16 19:38         ` Paul Brook
  0 siblings, 1 reply; 21+ messages in thread
From: Gleb Natapov @ 2009-06-16 19:11 UTC (permalink / raw)
  To: Paul Brook; +Cc: qemu-devel

On Tue, Jun 16, 2009 at 07:41:05PM +0100, Paul Brook wrote:
> >> If allow devices to be reset independently then they should probably set
> >> theit IRQ output on reset.
> >
> >It is not "if" it is "when". We have to allow device to be reset
> > independently for hot-unplug.
> 
> I'm not entirely convinced about reset-on-hotunplug. What about a device that 
> pulls its IRQ line high on reset?
> 
There are several different resets for device. SW reset (triggered by
driver issuing a command), HW reset (triggered by external reset
signal). If they are different device should implement them differently.
HW reset should be called on device unplug. And if spec says that device
pulls its IRQ line high on HW reset then device emulation should do that
too.

> > > The only relevant circumstances are if the devices raises an IRQ, and is
> > > then reset by software while the system is running. It's got nothing to
> > > do with piix3, PCI bus interrupt sharing or system reset. If you are
> > > seeing problems after a system reset then your bug lies elsewhere.
> >
> > So you apparently never heard about hardware reset? You never saw buggy
> > guests that does not reset HW before reboot?
> 
> Clearing device state on full system reset makes sense because the whole point 
> is that we're returning the system to its power-on state.
Power-up state is not always the same as the state after triggering
reset on real HW. System reset in QEMU currently emulates neither of
them properly. And on real HW there are different power planes and they
can be powered down independently. QEMU can't emulates this neither.

> Lowering the IRQ when a single device is reset also makes sense.
>
It is much more simple then that. If spec says that IRQ line should be lowered
on reset we should code it this way. (Spec may say this indirectly. It
may say that IRQ level is an AND of N registers and on reset those
registers are cleared).
 
> However having the device explicit set its IRQ line during a full system reset 
> is a different matter. This is probably harmless most of the time, and may 
> paper over other bugs (e.g. the PCI bus not being reset properly). However I 
> do not believe it is the correct justification for these changes.
> 
The IRQ state inside piix3 code is QEMU implementation detail. There is
now such thing on real HW.

The real bug in case of usb-uhci is that it is not reset on system reset.
Are you arguing with that too, or just with IRQ level bits?

--
			Gleb.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH] Register usb-uhci reset function.
  2009-06-16 12:47 [Qemu-devel] [PATCH] Register usb-uhci reset function Gleb Natapov
  2009-06-16 17:14 ` Paul Brook
@ 2009-06-16 19:19 ` Anthony Liguori
  2009-06-16 19:26   ` Gleb Natapov
  2009-06-17  9:07 ` Filip Navara
  2 siblings, 1 reply; 21+ messages in thread
From: Anthony Liguori @ 2009-06-16 19:19 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: qemu-devel

Gleb Natapov wrote:
> Update irq line on reset. Reseting irq line is required because
> racing irq from pci device will call piix3_set_irq(). piix3_set_irq()
> will remember current level in pci_irq_levels[]. The PIC line will be
> triggered if one of pci_irq_levels[] is set (depends on piix3 config).
> If for instance pci_irq_levels[0] and pci_irq_levels[1] are mapped to
> the same PIC irq and during reset pci_irq_levels[1] == 1, but device
> that drives pci_irq_levels[0] is initialized first the device driver
> will not be able to lower irq line.
>
> Without this patch RHEL4.8 hangs after reboot because usb interrupt
> does not go away.
>   

This breaks the build because...

> @@ -346,6 +346,7 @@ static void uhci_reset(UHCIState *s)
> +    qemu_register_reset(uhci_reset, 0, s);
>   
qemu_register_reset takes void (*)(void *) and you're passing void 
(*)(UHCIState *).

-- 
Regards,

Anthony Liguori

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH] Register usb-uhci reset function.
  2009-06-16 19:19 ` Anthony Liguori
@ 2009-06-16 19:26   ` Gleb Natapov
  0 siblings, 0 replies; 21+ messages in thread
From: Gleb Natapov @ 2009-06-16 19:26 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: qemu-devel

On Tue, Jun 16, 2009 at 02:19:44PM -0500, Anthony Liguori wrote:
> Gleb Natapov wrote:
>> Update irq line on reset. Reseting irq line is required because
>> racing irq from pci device will call piix3_set_irq(). piix3_set_irq()
>> will remember current level in pci_irq_levels[]. The PIC line will be
>> triggered if one of pci_irq_levels[] is set (depends on piix3 config).
>> If for instance pci_irq_levels[0] and pci_irq_levels[1] are mapped to
>> the same PIC irq and during reset pci_irq_levels[1] == 1, but device
>> that drives pci_irq_levels[0] is initialized first the device driver
>> will not be able to lower irq line.
>>
>> Without this patch RHEL4.8 hangs after reboot because usb interrupt
>> does not go away.
>>   
>
> This breaks the build because...
>
>> @@ -346,6 +346,7 @@ static void uhci_reset(UHCIState *s)
>> +    qemu_register_reset(uhci_reset, 0, s);
>>   
> qemu_register_reset takes void (*)(void *) and you're passing void  
> (*)(UHCIState *).
>
Tested it before -Werror :)

--
			Gleb.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH] Register usb-uhci reset function.
  2009-06-16 19:11       ` Gleb Natapov
@ 2009-06-16 19:38         ` Paul Brook
  2009-06-16 22:57           ` Zachary Amsden
  2009-06-17  8:12           ` Gleb Natapov
  0 siblings, 2 replies; 21+ messages in thread
From: Paul Brook @ 2009-06-16 19:38 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: qemu-devel

>What about the mantra that we should be as close to HW as possible?
>Reseting state only on a bus level will not work for hot-unplug anyway

Real hardware consists of electrical signals. Qemu emulates the logical 
effects of a device/subsystem, often at a much higher level. Thus "close to HW 
as possible" is no longer a straightforward concept.

Apart from anything else, I'm pretty sure than on all real hotplug PCI systems 
each socket is on its own PCI bus to allow electrical isolation of the device 
before physical hotplug occurs.

> > However having the device explicit set its IRQ line during a full system
> > reset is a different matter. This is probably harmless most of the time,
> > and may paper over other bugs (e.g. the PCI bus not being reset
> > properly). However I do not believe it is the correct justification for
> > these changes.
>
> The IRQ state inside piix3 code is QEMU implementation detail. There is
> now such thing on real HW.

Yes there is. It's a long strip of copper wire with a pullup resistor on the 
end.

Paul

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH] Register usb-uhci reset function.
  2009-06-16 19:38         ` Paul Brook
@ 2009-06-16 22:57           ` Zachary Amsden
  2009-06-17  8:12           ` Gleb Natapov
  1 sibling, 0 replies; 21+ messages in thread
From: Zachary Amsden @ 2009-06-16 22:57 UTC (permalink / raw)
  To: Paul Brook; +Cc: qemu-devel, Gleb Natapov

Paul Brook wrote:

>>> However having the device explicit set its IRQ line during a full system
>>> reset is a different matter. This is probably harmless most of the time,
>>> and may paper over other bugs (e.g. the PCI bus not being reset
>>> properly). However I do not believe it is the correct justification for
>>> these changes.
>> The IRQ state inside piix3 code is QEMU implementation detail. There is
>> now such thing on real HW.
> 
> Yes there is. It's a long strip of copper wire with a pullup resistor on the 
> end.

This is all nice, but what are we going to do about fixing the bug?

Zach

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH] Register usb-uhci reset function.
  2009-06-16 19:38         ` Paul Brook
  2009-06-16 22:57           ` Zachary Amsden
@ 2009-06-17  8:12           ` Gleb Natapov
  1 sibling, 0 replies; 21+ messages in thread
From: Gleb Natapov @ 2009-06-17  8:12 UTC (permalink / raw)
  To: Paul Brook; +Cc: qemu-devel

On Tue, Jun 16, 2009 at 08:38:24PM +0100, Paul Brook wrote:
> >What about the mantra that we should be as close to HW as possible?
> >Reseting state only on a bus level will not work for hot-unplug anyway
> 
> Real hardware consists of electrical signals. Qemu emulates the logical 
> effects of a device/subsystem, often at a much higher level. Thus "close to HW 
> as possible" is no longer a straightforward concept.
> 
You keep changing the story to suit your agenda. When I proposed a patch
to change qemu_irq() to propagate status of irq delivery back to the
caller you were against it because "this is not how real HW work" and no
amount of "Real hardware consists of electrical signals. Qemu emulates
the logical effects of a device/subsystem, often at a much higher
level." from other people haven't convince you a bit. But now, when it
comes from your mouth it is a gospel.

> Apart from anything else, I'm pretty sure than on all real hotplug PCI systems 
> each socket is on its own PCI bus to allow electrical isolation of the device 
> before physical hotplug occurs.
> 
> > > However having the device explicit set its IRQ line during a full system
> > > reset is a different matter. This is probably harmless most of the time,
> > > and may paper over other bugs (e.g. the PCI bus not being reset
> > > properly). However I do not believe it is the correct justification for
> > > these changes.
> >
> > The IRQ state inside piix3 code is QEMU implementation detail. There is
> > now such thing on real HW.
> 
> Yes there is. It's a long strip of copper wire with a pullup resistor on the 
> end.
> 
And guess who drives this long strip of copper wire?

a. Device.
b. Pci bridge
c. Lightning stroke

And BTW "pullup resistor" is there just in case no device is present.

--
			Gleb.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH] Register usb-uhci reset function.
  2009-06-16 12:47 [Qemu-devel] [PATCH] Register usb-uhci reset function Gleb Natapov
  2009-06-16 17:14 ` Paul Brook
  2009-06-16 19:19 ` Anthony Liguori
@ 2009-06-17  9:07 ` Filip Navara
  2009-06-17  9:43   ` Gleb Natapov
  2 siblings, 1 reply; 21+ messages in thread
From: Filip Navara @ 2009-06-17  9:07 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: qemu-devel

[-- Attachment #1: Type: text/plain, Size: 1395 bytes --]

On Tue, Jun 16, 2009 at 2:47 PM, Gleb Natapov <gleb@redhat.com> wrote:

> Update irq line on reset. Reseting irq line is required because
> racing irq from pci device will call piix3_set_irq(). piix3_set_irq()
> will remember current level in pci_irq_levels[]. The PIC line will be
> triggered if one of pci_irq_levels[] is set (depends on piix3 config).
> If for instance pci_irq_levels[0] and pci_irq_levels[1] are mapped to
> the same PIC irq and during reset pci_irq_levels[1] == 1, but device
> that drives pci_irq_levels[0] is initialized first the device driver
> will not be able to lower irq line.
>

I have been trying to stay away from the discussion for a long while, but I
can't keep it anymore. The patch is wrong. Since qemu_irq doesn't hold any
state, the information on reset has to be cleared on the places where the
state is maintained. Under no circumstances should any *_set_irq() function
should be called from reset handlers! Especially since the order of reset
handlers is not guaranteed. The reseting of the interrupt state in practice
means that interrupt status registers of individual devices should be
cleared, the PCI bus interrupt levels should be cleared - *in the PCI reset
handler* and so on. Eventually you will end up with reset handlers that
clear the state at every level, so there won't be any "hanging interrupts"
after reset.

Best regards,
Filip Navara

[-- Attachment #2: Type: text/html, Size: 1705 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH] Register usb-uhci reset function.
  2009-06-17  9:07 ` Filip Navara
@ 2009-06-17  9:43   ` Gleb Natapov
  2009-06-17 10:17     ` Filip Navara
  0 siblings, 1 reply; 21+ messages in thread
From: Gleb Natapov @ 2009-06-17  9:43 UTC (permalink / raw)
  To: Filip Navara; +Cc: qemu-devel

On Wed, Jun 17, 2009 at 11:07:24AM +0200, Filip Navara wrote:
> On Tue, Jun 16, 2009 at 2:47 PM, Gleb Natapov <gleb@redhat.com> wrote:
> 
> > Update irq line on reset. Reseting irq line is required because
> > racing irq from pci device will call piix3_set_irq(). piix3_set_irq()
> > will remember current level in pci_irq_levels[]. The PIC line will be
> > triggered if one of pci_irq_levels[] is set (depends on piix3 config).
> > If for instance pci_irq_levels[0] and pci_irq_levels[1] are mapped to
> > the same PIC irq and during reset pci_irq_levels[1] == 1, but device
> > that drives pci_irq_levels[0] is initialized first the device driver
> > will not be able to lower irq line.
> >
> 
> I have been trying to stay away from the discussion for a long while, but I
> can't keep it anymore. The patch is wrong. Since qemu_irq doesn't hold any
> state, the information on reset has to be cleared on the places where the
The fact that qemu_irq() doesn't hold any state has nothing to do with
what should be done on device reset. Nothing at all, nada, zilch. You
can repeat this many times more and it will not became more relevant.
What is important is that only device knows what irq level should be at
any given moment, and qemu_irq() is the way to communicate this to the
system. And if it want to drive irq high on reset it should be able to
do that.

> state is maintained. Under no circumstances should any *_set_irq() function
> should be called from reset handlers! Especially since the order of reset
> handlers is not guaranteed. The reseting of the interrupt state in practice
> means that interrupt status registers of individual devices should be
> cleared, the PCI bus interrupt levels should be cleared - *in the PCI reset
> handler* and so on. Eventually you will end up with reset handlers that
> clear the state at every level, so there won't be any "hanging interrupts"
> after reset.
> 
This will not work for reseting individual device (needed by hot-unplug)
since pci chipset reset is not called. Instead of fixing problem at
the level that needs fixing (device reset level) you propose to hack
solution into piix3 code. "Yaeh, gdb shows we have a wrong value in some
random array, why is it there? Who cares, lest zero this thing and forget
about it." And BTW _I_ send patch to do just that a week or so ago, and
I think it should be applied along with reseting irq line in device
reset handler just to prevent buggy devices from hanging a guest.

--
			Gleb.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH] Register usb-uhci reset function.
  2009-06-17  9:43   ` Gleb Natapov
@ 2009-06-17 10:17     ` Filip Navara
  2009-06-17 11:06       ` Gleb Natapov
  0 siblings, 1 reply; 21+ messages in thread
From: Filip Navara @ 2009-06-17 10:17 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: qemu-devel

[-- Attachment #1: Type: text/plain, Size: 4377 bytes --]

On Wed, Jun 17, 2009 at 11:43 AM, Gleb Natapov <gleb@redhat.com> wrote:

> On Wed, Jun 17, 2009 at 11:07:24AM +0200, Filip Navara wrote:
> > On Tue, Jun 16, 2009 at 2:47 PM, Gleb Natapov <gleb@redhat.com> wrote:
> >
> > > Update irq line on reset. Reseting irq line is required because
> > > racing irq from pci device will call piix3_set_irq(). piix3_set_irq()
> > > will remember current level in pci_irq_levels[]. The PIC line will be
> > > triggered if one of pci_irq_levels[] is set (depends on piix3 config).
> > > If for instance pci_irq_levels[0] and pci_irq_levels[1] are mapped to
> > > the same PIC irq and during reset pci_irq_levels[1] == 1, but device
> > > that drives pci_irq_levels[0] is initialized first the device driver
> > > will not be able to lower irq line.
> > >
> >
> > I have been trying to stay away from the discussion for a long while, but
> I
> > can't keep it anymore. The patch is wrong. Since qemu_irq doesn't hold
> any
> > state, the information on reset has to be cleared on the places where the
> The fact that qemu_irq() doesn't hold any state has nothing to do with
> what should be done on device reset. Nothing at all, nada, zilch. You
> can repeat this many times more and it will not became more relevant.


It has to do a lot with that - the qemu_irq abstraction has it's limits. And
there's a certain limit to which you can bend them. qemu_irq simulates
edges, not levels, so levels has to be emulated in the device infrastructure
in a way that doesn't necessarily match what happens in real HW. This also
means that in QEMU you actually have to build infrastructure for anything
that would cause the level change, such as device hot plug/unplug, and
communicate the current level as an edge.

What is important is that only device knows what irq level should be at
> any given moment, and qemu_irq() is the way to communicate this to the
> system.


In real HW, yes. In QEMU it's not the case with the current abstraction and
adding spurious qemu_set_irq calls won't change that.


> And if it want to drive irq high on reset it should be able to
> do that.


That's a fair argument. Doing so in reset callback is not the way to achieve
it though. With the current abstraction you'd need to add a secondary "late"
reset callback that would be called after all the normal reset callbacks are
processed. Anything else is horribly broken.

Consider a device connected to pins of two GPIO controllers. You would need
to ensure the GPIO controllers are in known state before qemu_set_irq is
called, otherwise they can't simulate the interrupt levels from the edge
information. If you did the reset in wrong order, the reset of the GPIO
controllers would discard the information about the pin level from the
device.


> > state is maintained. Under no circumstances should any *_set_irq()
> function
> > should be called from reset handlers! Especially since the order of reset
> > handlers is not guaranteed. The reseting of the interrupt state in
> practice
> > means that interrupt status registers of individual devices should be
> > cleared, the PCI bus interrupt levels should be cleared - *in the PCI
> reset
> > handler* and so on. Eventually you will end up with reset handlers that
> > clear the state at every level, so there won't be any "hanging
> interrupts"
> > after reset.
> >
> This will not work for reseting individual device (needed by hot-unplug)
> since pci chipset reset is not called.


Agreed. The fix to that is to properly call qemu_set_irq on hot-unplug (or
individual device reset for that matter).

Instead of fixing problem at
> the level that needs fixing (device reset level) you propose to hack
> solution into piix3 code.


That's not what I am proposing! I'm proposing to fix piix3 *system reset*
and implementing the necessary hot-unplug infrastructure for individual
device reset, which is very different thing from system reset.


> "Yaeh, gdb shows we have a wrong value in some
> random array, why is it there? Who cares, lest zero this thing and forget
> about it." And BTW _I_ send patch to do just that a week or so ago, and
> I think it should be applied along with reseting irq line in device
> reset handler just to prevent buggy devices from hanging a guest.
>

I didn't oppose patch 3/3 of your previous series. Fixing piix3 code should
definitely be done.

Best regards,
Filip Navara

[-- Attachment #2: Type: text/html, Size: 5802 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH] Register usb-uhci reset function.
  2009-06-17 10:17     ` Filip Navara
@ 2009-06-17 11:06       ` Gleb Natapov
  2009-06-17 11:25         ` Dor Laor
  2009-06-17 11:36         ` Filip Navara
  0 siblings, 2 replies; 21+ messages in thread
From: Gleb Natapov @ 2009-06-17 11:06 UTC (permalink / raw)
  To: Filip Navara; +Cc: qemu-devel

On Wed, Jun 17, 2009 at 12:17:53PM +0200, Filip Navara wrote:
> On Wed, Jun 17, 2009 at 11:43 AM, Gleb Natapov <gleb@redhat.com> wrote:
> 
> > On Wed, Jun 17, 2009 at 11:07:24AM +0200, Filip Navara wrote:
> > > On Tue, Jun 16, 2009 at 2:47 PM, Gleb Natapov <gleb@redhat.com> wrote:
> > >
> > > > Update irq line on reset. Reseting irq line is required because
> > > > racing irq from pci device will call piix3_set_irq(). piix3_set_irq()
> > > > will remember current level in pci_irq_levels[]. The PIC line will be
> > > > triggered if one of pci_irq_levels[] is set (depends on piix3 config).
> > > > If for instance pci_irq_levels[0] and pci_irq_levels[1] are mapped to
> > > > the same PIC irq and during reset pci_irq_levels[1] == 1, but device
> > > > that drives pci_irq_levels[0] is initialized first the device driver
> > > > will not be able to lower irq line.
> > > >
> > >
> > > I have been trying to stay away from the discussion for a long while, but
> > I
> > > can't keep it anymore. The patch is wrong. Since qemu_irq doesn't hold
> > any
> > > state, the information on reset has to be cleared on the places where the
> > The fact that qemu_irq() doesn't hold any state has nothing to do with
> > what should be done on device reset. Nothing at all, nada, zilch. You
> > can repeat this many times more and it will not became more relevant.
> 
> 
> It has to do a lot with that - the qemu_irq abstraction has it's limits. And
> there's a certain limit to which you can bend them. qemu_irq simulates
> edges, not levels, so levels has to be emulated in the device infrastructure
> in a way that doesn't necessarily match what happens in real HW. This also
> means that in QEMU you actually have to build infrastructure for anything
> that would cause the level change, such as device hot plug/unplug, and
> communicate the current level as an edge.
> 
Once again qemu_irq() is used to pass current irq level of the device to
the layer that simulate level interrupt: piix3. So level interrupt _is_
simulated, but the only one who nows what level should be at any given
moment is a device emulation itself.

> What is important is that only device knows what irq level should be at
> > any given moment, and qemu_irq() is the way to communicate this to the
> > system.
> 
> 
> In real HW, yes. In QEMU it's not the case with the current abstraction and
> adding spurious qemu_set_irq calls won't change that.
> 
Spurious? Irq level changed and you say qemu_set_irq() called to propagate
this change is spurious? uhci emulation does not track current irq
level it just calls uhci_update_irq() to figure it out and than calls
to qemu_set_irq(), so it should be perfectly fine to call qemu_set_irq()
even without level change.

> 
> > And if it want to drive irq high on reset it should be able to
> > do that.
> 
> 
> That's a fair argument. Doing so in reset callback is not the way to achieve
> it though. With the current abstraction you'd need to add a secondary "late"
> reset callback that would be called after all the normal reset callbacks are
> processed. Anything else is horribly broken.
> 
Yeah, reset callbacks all the way down.

> Consider a device connected to pins of two GPIO controllers. You would need
> to ensure the GPIO controllers are in known state before qemu_set_irq is
> called, otherwise they can't simulate the interrupt levels from the edge
> information. If you did the reset in wrong order, the reset of the GPIO
> controllers would discard the information about the pin level from the
> device.
Calling qemu_irq() on reset should be done only for level interrupt
obviously. I am not suggesting it should be done for every device/irq.

> > > state is maintained. Under no circumstances should any *_set_irq()
> > function
> > > should be called from reset handlers! Especially since the order of reset
> > > handlers is not guaranteed. The reseting of the interrupt state in
> > practice
> > > means that interrupt status registers of individual devices should be
> > > cleared, the PCI bus interrupt levels should be cleared - *in the PCI
> > reset
> > > handler* and so on. Eventually you will end up with reset handlers that
> > > clear the state at every level, so there won't be any "hanging
> > interrupts"
> > > after reset.
> > >
> > This will not work for reseting individual device (needed by hot-unplug)
> > since pci chipset reset is not called.
> 
> 
> Agreed. The fix to that is to properly call qemu_set_irq on hot-unplug (or
> individual device reset for that matter).
> 
> Instead of fixing problem at
> > the level that needs fixing (device reset level) you propose to hack
> > solution into piix3 code.
> 
> 
> That's not what I am proposing! I'm proposing to fix piix3 *system reset*
> and implementing the necessary hot-unplug infrastructure for individual
> device reset, which is very different thing from system reset.
> 
> 
So now we have:
1. system reset callback
2. late reset callback (so device can set its line properly after reset)
3. hot-unplug callback.

And it is not clear what part of device spec should be implemented in
any of them since real spec speaks another language.

> > "Yaeh, gdb shows we have a wrong value in some
> > random array, why is it there? Who cares, lest zero this thing and forget
> > about it." And BTW _I_ send patch to do just that a week or so ago, and
> > I think it should be applied along with reseting irq line in device
> > reset handler just to prevent buggy devices from hanging a guest.
> >
> 
> I didn't oppose patch 3/3 of your previous series. Fixing piix3 code should
> definitely be done.
> 
And what about 2/3? Or that part of state can stay intact after reset?

--
			Gleb.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH] Register usb-uhci reset function.
  2009-06-17 11:06       ` Gleb Natapov
@ 2009-06-17 11:25         ` Dor Laor
  2009-06-17 11:39           ` Gleb Natapov
  2009-06-17 11:36         ` Filip Navara
  1 sibling, 1 reply; 21+ messages in thread
From: Dor Laor @ 2009-06-17 11:25 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Filip Navara, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 6900 bytes --]

On 06/17/2009 02:06 PM, Gleb Natapov wrote:
> On Wed, Jun 17, 2009 at 12:17:53PM +0200, Filip Navara wrote:
>    
>> On Wed, Jun 17, 2009 at 11:43 AM, Gleb Natapov<gleb@redhat.com>  wrote:
>>
>>      
>>> On Wed, Jun 17, 2009 at 11:07:24AM +0200, Filip Navara wrote:
>>>        
>>>> On Tue, Jun 16, 2009 at 2:47 PM, Gleb Natapov<gleb@redhat.com>  wrote:
>>>>
>>>>          
>>>>> Update irq line on reset. Reseting irq line is required because
>>>>> racing irq from pci device will call piix3_set_irq(). piix3_set_irq()
>>>>> will remember current level in pci_irq_levels[]. The PIC line will be
>>>>> triggered if one of pci_irq_levels[] is set (depends on piix3 config).
>>>>> If for instance pci_irq_levels[0] and pci_irq_levels[1] are mapped to
>>>>> the same PIC irq and during reset pci_irq_levels[1] == 1, but device
>>>>> that drives pci_irq_levels[0] is initialized first the device driver
>>>>> will not be able to lower irq line.
>>>>>
>>>>>            
>>>> I have been trying to stay away from the discussion for a long while, but
>>>>          
>>> I
>>>        
>>>> can't keep it anymore. The patch is wrong. Since qemu_irq doesn't hold
>>>>          
>>> any
>>>        
>>>> state, the information on reset has to be cleared on the places where the
>>>>          
>>> The fact that qemu_irq() doesn't hold any state has nothing to do with
>>> what should be done on device reset. Nothing at all, nada, zilch. You
>>> can repeat this many times more and it will not became more relevant.
>>>        
>> It has to do a lot with that - the qemu_irq abstraction has it's limits. And
>> there's a certain limit to which you can bend them. qemu_irq simulates
>> edges, not levels, so levels has to be emulated in the device infrastructure
>> in a way that doesn't necessarily match what happens in real HW. This also
>> means that in QEMU you actually have to build infrastructure for anything
>> that would cause the level change, such as device hot plug/unplug, and
>> communicate the current level as an edge.
>>
>>      
> Once again qemu_irq() is used to pass current irq level of the device to
> the layer that simulate level interrupt: piix3. So level interrupt _is_
> simulated, but the only one who nows what level should be at any given
> moment is a device emulation itself.
>
>    
>> What is important is that only device knows what irq level should be at
>>      
>>> any given moment, and qemu_irq() is the way to communicate this to the
>>> system.
>>>        
>> In real HW, yes. In QEMU it's not the case with the current abstraction and
>> adding spurious qemu_set_irq calls won't change that.
>>
>>      
> Spurious? Irq level changed and you say qemu_set_irq() called to propagate
> this change is spurious? uhci emulation does not track current irq
> level it just calls uhci_update_irq() to figure it out and than calls
> to qemu_set_irq(), so it should be perfectly fine to call qemu_set_irq()
> even without level change.
>
>    
>>> And if it want to drive irq high on reset it should be able to
>>> do that.
>>>        
>> That's a fair argument. Doing so in reset callback is not the way to achieve
>> it though. With the current abstraction you'd need to add a secondary "late"
>> reset callback that would be called after all the normal reset callbacks are
>> processed. Anything else is horribly broken.
>>
>>      
> Yeah, reset callbacks all the way down.
>
>    
>> Consider a device connected to pins of two GPIO controllers. You would need
>> to ensure the GPIO controllers are in known state before qemu_set_irq is
>> called, otherwise they can't simulate the interrupt levels from the edge
>> information. If you did the reset in wrong order, the reset of the GPIO
>> controllers would discard the information about the pin level from the
>> device.
>>      
> Calling qemu_irq() on reset should be done only for level interrupt
> obviously. I am not suggesting it should be done for every device/irq.
>
>    
>>>> state is maintained. Under no circumstances should any *_set_irq()
>>>>          
>>> function
>>>        
>>>> should be called from reset handlers! Especially since the order of reset
>>>> handlers is not guaranteed. The reseting of the interrupt state in
>>>>          
>>> practice
>>>        
>>>> means that interrupt status registers of individual devices should be
>>>> cleared, the PCI bus interrupt levels should be cleared - *in the PCI
>>>>          
>>> reset
>>>        
>>>> handler* and so on. Eventually you will end up with reset handlers that
>>>> clear the state at every level, so there won't be any "hanging
>>>>          
>>> interrupts"
>>>        
>>>> after reset.
>>>>
>>>>          
>>> This will not work for reseting individual device (needed by hot-unplug)
>>> since pci chipset reset is not called.
>>>        
>> Agreed. The fix to that is to properly call qemu_set_irq on hot-unplug (or
>> individual device reset for that matter).
>>
>> Instead of fixing problem at
>>      
>>> the level that needs fixing (device reset level) you propose to hack
>>> solution into piix3 code.
>>>        
>> That's not what I am proposing! I'm proposing to fix piix3 *system reset*
>> and implementing the necessary hot-unplug infrastructure for individual
>> device reset, which is very different thing from system reset.
>>
>>
>>      
> So now we have:
> 1. system reset callback
> 2. late reset callback (so device can set its line properly after reset)
> 3. hot-unplug callback.
>
> And it is not clear what part of device spec should be implemented in
> any of them since real spec speaks another language.
>
>    
>>> "Yaeh, gdb shows we have a wrong value in some
>>> random array, why is it there? Who cares, lest zero this thing and forget
>>> about it." And BTW _I_ send patch to do just that a week or so ago, and
>>> I think it should be applied along with reseting irq line in device
>>> reset handler just to prevent buggy devices from hanging a guest.
>>>
>>>        
>> I didn't oppose patch 3/3 of your previous series. Fixing piix3 code should
>> definitely be done.
>>
>>      
> And what about 2/3? Or that part of state can stay intact after reset?
>    

Some general comments regarding this thread:
Most pci devices register a reset handler and reset the irq line just 
like this patch.
I propose to accept this patch first, since it solves a real problem 
(and not a theoretical one).

Once we fix this issue, we can address the general solution that will 
take care soft reset,
hard reset, device hotplug and kvm.
btw: for kvm to work, we need the set_irq abstraction notify the kernel, 
piix3 reset won't be enough.

btw2: a general comment - I hope that comments of the form "This is all 
wrong" will disappear, instead of
giving technical explanation. It makes the discussion emotional instead 
of pure technical and drives
contributers away and even worse!
> --
> 			Gleb.
>
>
>    


[-- Attachment #2: Type: text/html, Size: 8963 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH] Register usb-uhci reset function.
  2009-06-17 11:06       ` Gleb Natapov
  2009-06-17 11:25         ` Dor Laor
@ 2009-06-17 11:36         ` Filip Navara
  2009-06-17 12:12           ` Gleb Natapov
  1 sibling, 1 reply; 21+ messages in thread
From: Filip Navara @ 2009-06-17 11:36 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: qemu-devel

[-- Attachment #1: Type: text/plain, Size: 8593 bytes --]

On Wed, Jun 17, 2009 at 1:06 PM, Gleb Natapov <gleb@redhat.com> wrote:

> On Wed, Jun 17, 2009 at 12:17:53PM +0200, Filip Navara wrote:
> > On Wed, Jun 17, 2009 at 11:43 AM, Gleb Natapov <gleb@redhat.com> wrote:
> >
> > > On Wed, Jun 17, 2009 at 11:07:24AM +0200, Filip Navara wrote:
> > > > On Tue, Jun 16, 2009 at 2:47 PM, Gleb Natapov <gleb@redhat.com>
> wrote:
> > > >
> > > > > Update irq line on reset. Reseting irq line is required because
> > > > > racing irq from pci device will call piix3_set_irq().
> piix3_set_irq()
> > > > > will remember current level in pci_irq_levels[]. The PIC line will
> be
> > > > > triggered if one of pci_irq_levels[] is set (depends on piix3
> config).
> > > > > If for instance pci_irq_levels[0] and pci_irq_levels[1] are mapped
> to
> > > > > the same PIC irq and during reset pci_irq_levels[1] == 1, but
> device
> > > > > that drives pci_irq_levels[0] is initialized first the device
> driver
> > > > > will not be able to lower irq line.
> > > > >
> > > >
> > > > I have been trying to stay away from the discussion for a long while,
> but
> > > I
> > > > can't keep it anymore. The patch is wrong. Since qemu_irq doesn't
> hold
> > > any
> > > > state, the information on reset has to be cleared on the places where
> the
> > > The fact that qemu_irq() doesn't hold any state has nothing to do with
> > > what should be done on device reset. Nothing at all, nada, zilch. You
> > > can repeat this many times more and it will not became more relevant.
> >
> >
> > It has to do a lot with that - the qemu_irq abstraction has it's limits.
> And
> > there's a certain limit to which you can bend them. qemu_irq simulates
> > edges, not levels, so levels has to be emulated in the device
> infrastructure
> > in a way that doesn't necessarily match what happens in real HW. This
> also
> > means that in QEMU you actually have to build infrastructure for anything
> > that would cause the level change, such as device hot plug/unplug, and
> > communicate the current level as an edge.
> >
> Once again qemu_irq() is used to pass current irq level of the device to
> the layer that simulate level interrupt: piix3. So level interrupt _is_
> simulated, but the only one who nows what level should be at any given
> moment is a device emulation itself.


That's why all the paths where the level in device emulation can change has
to be covered, so the information in piix3 stays accurate. What I'm saying
is that the reset callback is not one of the points where you can call
qemu_set_irq since the state of other devices is unknown at that point (for
the purpose of simulating interrupt levels). When the reset callbacks are
done with their work, all the simulated interrupt levels at all layers have
to be restored to known state. This state is currently with all the levels
set to zero. If you call qemu_set_irq during the reset callbacks the
interrupt levels recorded in the device tree may end up in uncertain state.
The fact that for PCI devices it works today is just a nice coincidence, if
the device "tree" was really a graph (imagine a cable from the PCI card to
some other device as it was in the SB16 cards) then it would suddenly break.


> > What is important is that only device knows what irq level should be at
> > > any given moment, and qemu_irq() is the way to communicate this to the
> > > system.
> >
> >
> > In real HW, yes. In QEMU it's not the case with the current abstraction
> and
> > adding spurious qemu_set_irq calls won't change that.
> >
> Spurious? Irq level changed and you say qemu_set_irq() called to propagate
> this change is spurious? uhci emulation does not track current irq
> level it just calls uhci_update_irq() to figure it out and than calls
> to qemu_set_irq(), so it should be perfectly fine to call qemu_set_irq()
> even without level change.


To change the level you first have to have the whole device tree topology in
known state. In the reset callback that's not the case. If there was a
"late" reset callback as mentioned earlier then that would be a place where
it could work.

Generally speaking, it is fine to call qemu_set_irq even if the level didn't
change, but only if the whole device tree topology has up-to-date interrupt
levels.


> > > And if it want to drive irq high on reset it should be able to
> > > do that.
> >
> >
> > That's a fair argument. Doing so in reset callback is not the way to
> achieve
> > it though. With the current abstraction you'd need to add a secondary
> "late"
> > reset callback that would be called after all the normal reset callbacks
> are
> > processed. Anything else is horribly broken.
> >
> Yeah, reset callbacks all the way down.
>
> > Consider a device connected to pins of two GPIO controllers. You would
> need
> > to ensure the GPIO controllers are in known state before qemu_set_irq is
> > called, otherwise they can't simulate the interrupt levels from the edge
> > information. If you did the reset in wrong order, the reset of the GPIO
> > controllers would discard the information about the pin level from the
> > device.
> Calling qemu_irq() on reset should be done only for level interrupt
> obviously. I am not suggesting it should be done for every device/irq.


I'm suggesting not to call it at all and always reset the recorded
"interrupt levels" all the way down the device tree on reset. If there is a
device which sets the IRQ line high on reset then we should introduce the
"late" reset/init callback.


> > > > state is maintained. Under no circumstances should any *_set_irq()
> > > function
> > > > should be called from reset handlers! Especially since the order of
> reset
> > > > handlers is not guaranteed. The reseting of the interrupt state in
> > > practice
> > > > means that interrupt status registers of individual devices should be
> > > > cleared, the PCI bus interrupt levels should be cleared - *in the PCI
> > > reset
> > > > handler* and so on. Eventually you will end up with reset handlers
> that
> > > > clear the state at every level, so there won't be any "hanging
> > > interrupts"
> > > > after reset.
> > > >
> > > This will not work for reseting individual device (needed by
> hot-unplug)
> > > since pci chipset reset is not called.
> >
> >
> > Agreed. The fix to that is to properly call qemu_set_irq on hot-unplug
> (or
> > individual device reset for that matter).
> >
> > Instead of fixing problem at
> > > the level that needs fixing (device reset level) you propose to hack
> > > solution into piix3 code.
> >
> >
> > That's not what I am proposing! I'm proposing to fix piix3 *system reset*
> > and implementing the necessary hot-unplug infrastructure for individual
> > device reset, which is very different thing from system reset.
> >
> >
> So now we have:
> 1. system reset callback
> 2. late reset callback (so device can set its line properly after reset)
> 3. hot-unplug callback.
>
> And it is not clear what part of device spec should be implemented in
> any of them since real spec speaks another language.


If it's not clear then let's make it clear. Real HW communicates interrupt
levels and detects edges, QEMU communicates edges and simulates levels, so
it has to work differently than on real HW in this particular case.

The system reset callback would reset device registers to a known state as
specified by the HW documentation. The interrupt levels would be driven low
at this point on the devices which simulate "interrupt levels" (buses, PIC,
etc.) - as if "0" was sent to the copper wire ;)

The "late" reset/init callback would drive the IRQs (and GPIOs) high if
necessary once the system is in a known state, ie. after all reset callbacks
have successfully completed.

Hot-unplug callback doesn't need reset the registers of the device, it only
has to drive the IRQ low on the bus - ie. call *_set_irq(0) for the specific
bus. In theory that could be done at the bus level.

Feel free to disagree or correct me. Suggestions welcome.


> > > "Yaeh, gdb shows we have a wrong value in some
> > > random array, why is it there? Who cares, lest zero this thing and
> forget
> > > about it." And BTW _I_ send patch to do just that a week or so ago, and
> > > I think it should be applied along with reseting irq line in device
> > > reset handler just to prevent buggy devices from hanging a guest.
> > >
> >
> > I didn't oppose patch 3/3 of your previous series. Fixing piix3 code
> should
> > definitely be done.
> >
> And what about 2/3? Or that part of state can stay intact after reset?
>

I don't remember the exact contents of 2/3, sorry.

Best regards,
Filip Navara

[-- Attachment #2: Type: text/html, Size: 10840 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH] Register usb-uhci reset function.
  2009-06-17 11:25         ` Dor Laor
@ 2009-06-17 11:39           ` Gleb Natapov
  2009-06-17 11:50             ` Filip Navara
  0 siblings, 1 reply; 21+ messages in thread
From: Gleb Natapov @ 2009-06-17 11:39 UTC (permalink / raw)
  To: Dor Laor; +Cc: Filip Navara, qemu-devel

On Wed, Jun 17, 2009 at 02:25:24PM +0300, Dor Laor wrote:
> Some general comments regarding this thread:
> Most pci devices register a reset handler and reset the irq line just  
> like this patch.
> I propose to accept this patch first, since it solves a real problem  
> (and not a theoretical one).
>
uhci not registering reset handler is clearly a bug and I hope everyone
agrees that this should be fixed. I think the main opposition is against
add qemu_irq() call to the reset handler. And if patch to piix3 is
applied then qemu_irq() in uhci reset handler can be dropped (it
shouldn't IMHO but whatever). The question is why maintainers who now argue
that reset function to piix3 is the beset thing since sliced bread haven't
applied the patch when it was posted?

--
			Gleb.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH] Register usb-uhci reset function.
  2009-06-17 11:39           ` Gleb Natapov
@ 2009-06-17 11:50             ` Filip Navara
  0 siblings, 0 replies; 21+ messages in thread
From: Filip Navara @ 2009-06-17 11:50 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Dor Laor, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 1560 bytes --]

On Wed, Jun 17, 2009 at 1:39 PM, Gleb Natapov <gleb@redhat.com> wrote:

> On Wed, Jun 17, 2009 at 02:25:24PM +0300, Dor Laor wrote:
> > Some general comments regarding this thread:
> > Most pci devices register a reset handler and reset the irq line just
> > like this patch.
> > I propose to accept this patch first, since it solves a real problem
> > (and not a theoretical one).
> >
> uhci not registering reset handler is clearly a bug and I hope everyone
> agrees that this should be fixed. I think the main opposition is against
> add qemu_irq() call to the reset handler. And if patch to piix3 is
> applied then qemu_irq() in uhci reset handler can be dropped (it
> shouldn't IMHO but whatever). The question is why maintainers who now argue
> that reset function to piix3 is the beset thing since sliced bread haven't
> applied the patch when it was posted?


+1 for dropping the set_irq in the reset handler and applying the piix3
patch.

I hope we can come to a consensus and finally get the patches in, the bug is
real and it's there. I can't speak for the maintainers since I am not one of
them.

The main reason why I argue for dropping the set_irq is that it's
conceptually wrong to call it. It took me a while to read all the mails
before I understood that and I am only trying to communicate the knowledge
further. I have been hardly bitten by other patches that fixed real problems
in a wrong way in the last month or two and I don't want this to happen
again if there is already a solution that is "more correct".

Best regards,
Filip Navara

[-- Attachment #2: Type: text/html, Size: 2248 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH] Register usb-uhci reset function.
  2009-06-17 11:36         ` Filip Navara
@ 2009-06-17 12:12           ` Gleb Natapov
  2009-06-17 13:03             ` Filip Navara
  0 siblings, 1 reply; 21+ messages in thread
From: Gleb Natapov @ 2009-06-17 12:12 UTC (permalink / raw)
  To: Filip Navara; +Cc: qemu-devel

On Wed, Jun 17, 2009 at 01:36:08PM +0200, Filip Navara wrote:
> On Wed, Jun 17, 2009 at 1:06 PM, Gleb Natapov <gleb@redhat.com> wrote:
> 
> > On Wed, Jun 17, 2009 at 12:17:53PM +0200, Filip Navara wrote:
> > > On Wed, Jun 17, 2009 at 11:43 AM, Gleb Natapov <gleb@redhat.com> wrote:
> > >
> > > > On Wed, Jun 17, 2009 at 11:07:24AM +0200, Filip Navara wrote:
> > > > > On Tue, Jun 16, 2009 at 2:47 PM, Gleb Natapov <gleb@redhat.com>
> > wrote:
> > > > >
> > > > > > Update irq line on reset. Reseting irq line is required because
> > > > > > racing irq from pci device will call piix3_set_irq().
> > piix3_set_irq()
> > > > > > will remember current level in pci_irq_levels[]. The PIC line will
> > be
> > > > > > triggered if one of pci_irq_levels[] is set (depends on piix3
> > config).
> > > > > > If for instance pci_irq_levels[0] and pci_irq_levels[1] are mapped
> > to
> > > > > > the same PIC irq and during reset pci_irq_levels[1] == 1, but
> > device
> > > > > > that drives pci_irq_levels[0] is initialized first the device
> > driver
> > > > > > will not be able to lower irq line.
> > > > > >
> > > > >
> > > > > I have been trying to stay away from the discussion for a long while,
> > but
> > > > I
> > > > > can't keep it anymore. The patch is wrong. Since qemu_irq doesn't
> > hold
> > > > any
> > > > > state, the information on reset has to be cleared on the places where
> > the
> > > > The fact that qemu_irq() doesn't hold any state has nothing to do with
> > > > what should be done on device reset. Nothing at all, nada, zilch. You
> > > > can repeat this many times more and it will not became more relevant.
> > >
> > >
> > > It has to do a lot with that - the qemu_irq abstraction has it's limits.
> > And
> > > there's a certain limit to which you can bend them. qemu_irq simulates
> > > edges, not levels, so levels has to be emulated in the device
> > infrastructure
> > > in a way that doesn't necessarily match what happens in real HW. This
> > also
> > > means that in QEMU you actually have to build infrastructure for anything
> > > that would cause the level change, such as device hot plug/unplug, and
> > > communicate the current level as an edge.
> > >
> > Once again qemu_irq() is used to pass current irq level of the device to
> > the layer that simulate level interrupt: piix3. So level interrupt _is_
> > simulated, but the only one who nows what level should be at any given
> > moment is a device emulation itself.
> 
> 
> That's why all the paths where the level in device emulation can change has
> to be covered, so the information in piix3 stays accurate. What I'm saying
> is that the reset callback is not one of the points where you can call
> qemu_set_irq since the state of other devices is unknown at that point (for
> the purpose of simulating interrupt levels). When the reset callbacks are
I understand what you are saying, but the logic that emulates level irqs
from qemu_irq() edge (in piix3) is not a part of device emulation, so it
is not obvious to me that system reset should reset it and bring it to
unknown state. Line state change may reach IRQ chip before or after reset, but
this doesn't matter since IRQ chip will be in knows state in both cases.

> done with their work, all the simulated interrupt levels at all layers have
> to be restored to known state. This state is currently with all the levels
> set to zero. If you call qemu_set_irq during the reset callbacks the
> interrupt levels recorded in the device tree may end up in uncertain state.
That may happen on real HW too. OS will get spurious interrupt, that's it.

> The fact that for PCI devices it works today is just a nice coincidence, if
> the device "tree" was really a graph (imagine a cable from the PCI card to
> some other device as it was in the SB16 cards) then it would suddenly break.
> 
> 
> > > What is important is that only device knows what irq level should be at
> > > > any given moment, and qemu_irq() is the way to communicate this to the
> > > > system.
> > >
> > >
> > > In real HW, yes. In QEMU it's not the case with the current abstraction
> > and
> > > adding spurious qemu_set_irq calls won't change that.
> > >
> > Spurious? Irq level changed and you say qemu_set_irq() called to propagate
> > this change is spurious? uhci emulation does not track current irq
> > level it just calls uhci_update_irq() to figure it out and than calls
> > to qemu_set_irq(), so it should be perfectly fine to call qemu_set_irq()
> > even without level change.
> 
> 
> To change the level you first have to have the whole device tree topology in
> known state. In the reset callback that's not the case. If there was a
Level change (as seen by other device) can happen only after or before
reset, not during reset. In both cases the device state is known.

> "late" reset callback as mentioned earlier then that would be a place where
> it could work.
> 
> Generally speaking, it is fine to call qemu_set_irq even if the level didn't
> change, but only if the whole device tree topology has up-to-date interrupt
> levels.
> 
> 
> > > > And if it want to drive irq high on reset it should be able to
> > > > do that.
> > >
> > >
> > > That's a fair argument. Doing so in reset callback is not the way to
> > achieve
> > > it though. With the current abstraction you'd need to add a secondary
> > "late"
> > > reset callback that would be called after all the normal reset callbacks
> > are
> > > processed. Anything else is horribly broken.
> > >
> > Yeah, reset callbacks all the way down.
> >
> > > Consider a device connected to pins of two GPIO controllers. You would
> > need
> > > to ensure the GPIO controllers are in known state before qemu_set_irq is
> > > called, otherwise they can't simulate the interrupt levels from the edge
> > > information. If you did the reset in wrong order, the reset of the GPIO
> > > controllers would discard the information about the pin level from the
> > > device.
> > Calling qemu_irq() on reset should be done only for level interrupt
> > obviously. I am not suggesting it should be done for every device/irq.
> 
> 
> I'm suggesting not to call it at all and always reset the recorded
> "interrupt levels" all the way down the device tree on reset. If there is a
By doing this you are setting device irq line without asking device. It
will work since assumption that irq level is zero after reset is usually
correct.

> device which sets the IRQ line high on reset then we should introduce the
> "late" reset/init callback.
> 
Yes, calling set_irq(1) on reset may not achieve this since irq chip may
be reset after set_irq() was called.

> 
> > > > > state is maintained. Under no circumstances should any *_set_irq()
> > > > function
> > > > > should be called from reset handlers! Especially since the order of
> > reset
> > > > > handlers is not guaranteed. The reseting of the interrupt state in
> > > > practice
> > > > > means that interrupt status registers of individual devices should be
> > > > > cleared, the PCI bus interrupt levels should be cleared - *in the PCI
> > > > reset
> > > > > handler* and so on. Eventually you will end up with reset handlers
> > that
> > > > > clear the state at every level, so there won't be any "hanging
> > > > interrupts"
> > > > > after reset.
> > > > >
> > > > This will not work for reseting individual device (needed by
> > hot-unplug)
> > > > since pci chipset reset is not called.
> > >
> > >
> > > Agreed. The fix to that is to properly call qemu_set_irq on hot-unplug
> > (or
> > > individual device reset for that matter).
> > >
> > > Instead of fixing problem at
> > > > the level that needs fixing (device reset level) you propose to hack
> > > > solution into piix3 code.
> > >
> > >
> > > That's not what I am proposing! I'm proposing to fix piix3 *system reset*
> > > and implementing the necessary hot-unplug infrastructure for individual
> > > device reset, which is very different thing from system reset.
> > >
> > >
> > So now we have:
> > 1. system reset callback
> > 2. late reset callback (so device can set its line properly after reset)
> > 3. hot-unplug callback.
> >
> > And it is not clear what part of device spec should be implemented in
> > any of them since real spec speaks another language.
> 
> 
> If it's not clear then let's make it clear. Real HW communicates interrupt
> levels and detects edges, QEMU communicates edges and simulates levels, so
> it has to work differently than on real HW in this particular case.
> 
> The system reset callback would reset device registers to a known state as
> specified by the HW documentation. The interrupt levels would be driven low
> at this point on the devices which simulate "interrupt levels" (buses, PIC,
> etc.) - as if "0" was sent to the copper wire ;)
> 
> The "late" reset/init callback would drive the IRQs (and GPIOs) high if
> necessary once the system is in a known state, ie. after all reset callbacks
> have successfully completed.
> 
Basically you want to simulate "dead" period when devices ignore their inputs for
some time after reset (like on real HW).

> Hot-unplug callback doesn't need reset the registers of the device, it only
> has to drive the IRQ low on the bus - ie. call *_set_irq(0) for the specific
> bus. In theory that could be done at the bus level.
> 
I should be done at bus level indeed. The device is disconnected at that
point.

> Feel free to disagree or correct me. Suggestions welcome.
> 
Not disagreeing but this still is not enough, for instance there are
devices that have different state after power-up and reset.
 
> 
> > > > "Yaeh, gdb shows we have a wrong value in some
> > > > random array, why is it there? Who cares, lest zero this thing and
> > forget
> > > > about it." And BTW _I_ send patch to do just that a week or so ago, and
> > > > I think it should be applied along with reseting irq line in device
> > > > reset handler just to prevent buggy devices from hanging a guest.
> > > >
> > >
> > > I didn't oppose patch 3/3 of your previous series. Fixing piix3 code
> > should
> > > definitely be done.
> > >
> > And what about 2/3? Or that part of state can stay intact after reset?
> >
> 
> I don't remember the exact contents of 2/3, sorry.
> 
http://lists.gnu.org/archive/html/qemu-devel/2009-06/msg00342.html

--
			Gleb.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH] Register usb-uhci reset function.
  2009-06-17 12:12           ` Gleb Natapov
@ 2009-06-17 13:03             ` Filip Navara
  0 siblings, 0 replies; 21+ messages in thread
From: Filip Navara @ 2009-06-17 13:03 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: qemu-devel

[-- Attachment #1: Type: text/plain, Size: 5050 bytes --]

On Wed, Jun 17, 2009 at 2:12 PM, Gleb Natapov <gleb@redhat.com> wrote:
[snip]

> >
> > That's why all the paths where the level in device emulation can change
> has
> > to be covered, so the information in piix3 stays accurate. What I'm
> saying
> > is that the reset callback is not one of the points where you can call
> > qemu_set_irq since the state of other devices is unknown at that point
> (for
> > the purpose of simulating interrupt levels). When the reset callbacks are
> I understand what you are saying, but the logic that emulates level irqs
> from qemu_irq() edge (in piix3) is not a part of device emulation, so it
> is not obvious to me that system reset should reset it and bring it to
> unknown state. Line state change may reach IRQ chip before or after reset,
> but
> this doesn't matter since IRQ chip will be in knows state in both cases.
>

Actually I want to bring it to a known state. It may not be equivalent to
what happens in real HW, but it's compatible behavior and deterministic,
which makes debugging easier.

> done with their work, all the simulated interrupt levels at all layers
> have
> > to be restored to known state. This state is currently with all the
> levels
> > set to zero. If you call qemu_set_irq during the reset callbacks the
> > interrupt levels recorded in the device tree may end up in uncertain
> state.
> That may happen on real HW too. OS will get spurious interrupt, that's it.


Agreed.

That's what I said earlier (or at least what I thought), you would get
spurious interrupts. It may have other consequences which I can't think of,
right now.

[snip]

> > To change the level you first have to have the whole device tree topology
> in
> > known state. In the reset callback that's not the case. If there was a
> Level change (as seen by other device) can happen only after or before
> reset, not during reset. In both cases the device state is known.


Hmm, makes sense. I suppose that if none of the bus/PIC/etc. devices altered
their "interrupt level" state during the reset and each individual device
brought it's IRQ line down it would work too. As long as the interrupts are
communicated to the CPU after the reset in a correct way. I'm just not
convinced that it's the right way. My rationale is that on power-up the
interrupt levels in PIC/PCI/etc. are set to zero and they only change as the
emulation is executed. So why should the reset be different? Doing it in a
different way on reset sounds fishy to me, especially since it involves more
code for no obvious benefit. Moreover APIC and lot of other devices clear
the interrupt state now, so you'd have to change them too.

[snip]

> > I'm suggesting not to call it at all and always reset the recorded
> > "interrupt levels" all the way down the device tree on reset. If there is
> a
> By doing this you are setting device irq line without asking device. It
> will work since assumption that irq level is zero after reset is usually
> correct.


Yes, but the emulation is not running at that point, so it's harmless.


> > device which sets the IRQ line high on reset then we should introduce the
> > "late" reset/init callback.
> >
> Yes, calling set_irq(1) on reset may not achieve this since irq chip may
> be reset after set_irq() was called.


Right, exactly my point.

[snip]

> > If it's not clear then let's make it clear. Real HW communicates
> interrupt
> > levels and detects edges, QEMU communicates edges and simulates levels,
> so
> > it has to work differently than on real HW in this particular case.
> >
> > The system reset callback would reset device registers to a known state
> as
> > specified by the HW documentation. The interrupt levels would be driven
> low
> > at this point on the devices which simulate "interrupt levels" (buses,
> PIC,
> > etc.) - as if "0" was sent to the copper wire ;)
> >
> > The "late" reset/init callback would drive the IRQs (and GPIOs) high if
> > necessary once the system is in a known state, ie. after all reset
> callbacks
> > have successfully completed.
> >
> Basically you want to simulate "dead" period when devices ignore their
> inputs for
> some time after reset (like on real HW).


Sort of, but in a deterministic way.


> > Hot-unplug callback doesn't need reset the registers of the device, it
> only
> > has to drive the IRQ low on the bus - ie. call *_set_irq(0) for the
> specific
> > bus. In theory that could be done at the bus level.
> >
> I should be done at bus level indeed. The device is disconnected at that
> point.


Agreed.


> > Feel free to disagree or correct me. Suggestions welcome.
> >
> Not disagreeing but this still is not enough, for instance there are
> devices that have different state after power-up and reset.


Let's document it once we (including the maintainers) agree on something. At
least that would set the precedent for future.

[snip]

> >
> > I don't remember the exact contents of 2/3, sorry.
> >
> http://lists.gnu.org/archive/html/qemu-devel/2009-06/msg00342.html
>

Looks good to me.

Best regards,
Filip Navara

[-- Attachment #2: Type: text/html, Size: 7347 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2009-06-17 13:04 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-06-16 12:47 [Qemu-devel] [PATCH] Register usb-uhci reset function Gleb Natapov
2009-06-16 17:14 ` Paul Brook
2009-06-16 17:37   ` Gleb Natapov
2009-06-16 18:41     ` Paul Brook
2009-06-16 19:11       ` Gleb Natapov
2009-06-16 19:38         ` Paul Brook
2009-06-16 22:57           ` Zachary Amsden
2009-06-17  8:12           ` Gleb Natapov
2009-06-16 18:02   ` Blue Swirl
2009-06-16 19:19 ` Anthony Liguori
2009-06-16 19:26   ` Gleb Natapov
2009-06-17  9:07 ` Filip Navara
2009-06-17  9:43   ` Gleb Natapov
2009-06-17 10:17     ` Filip Navara
2009-06-17 11:06       ` Gleb Natapov
2009-06-17 11:25         ` Dor Laor
2009-06-17 11:39           ` Gleb Natapov
2009-06-17 11:50             ` Filip Navara
2009-06-17 11:36         ` Filip Navara
2009-06-17 12:12           ` Gleb Natapov
2009-06-17 13:03             ` Filip Navara

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.