* [Qemu-devel] [PATCH 0/2] ohci: try to mimic real hardware command latency
@ 2015-12-16 13:39 Laurent Vivier
  2015-12-16 13:39 ` [Qemu-devel] [PATCH 1/2] ohci: delay first SOF interrupt Laurent Vivier
  2015-12-16 13:39 ` [Qemu-devel] [PATCH 2/2] ohci: clear pending SOF on suspend Laurent Vivier
  0 siblings, 2 replies; 5+ messages in thread
From: Laurent Vivier @ 2015-12-16 13:39 UTC (permalink / raw)
  To: qemu-devel; +Cc: lvivier, Gerd Hoffmann

The Linux OHCI driver has some critical sections that are not protected
against device interrupts. On real hardware this is generally not a
problem: hardware latency keeps interrupts from being triggered fast
enough to land inside these critical sections.

In theory, though, it can happen, and when QEMU runs on an overcommitted
CPU the vCPU becomes slow enough that it does.

This series fixes a kernel crash on boot (CPU stuck) when the OHCI driver
tries to resume or suspend the device.

Laurent Vivier (2):
  ohci: delay first SOF interrupt
  ohci: clear pending SOF on suspend

 hw/usb/hcd-ohci.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

-- 
1.8.3.1

* [Qemu-devel] [PATCH 1/2] ohci: delay first SOF interrupt
  2015-12-16 13:39 [Qemu-devel] [PATCH 0/2] ohci: try to mimic real hardware command latency Laurent Vivier
@ 2015-12-16 13:39 ` Laurent Vivier
  2015-12-17  9:25   ` Thomas Huth
  2015-12-16 13:39 ` [Qemu-devel] [PATCH 2/2] ohci: clear pending SOF on suspend Laurent Vivier
  1 sibling, 1 reply; 5+ messages in thread
From: Laurent Vivier @ 2015-12-16 13:39 UTC (permalink / raw)
  To: qemu-devel; +Cc: lvivier, Gerd Hoffmann

On an overcommitted CPU, the kernel can be so slow that the device
triggers an interrupt while the driver is not ready to receive it.
This drives us into an infinite loop.

This does not happen on real hardware because real hardware never sends
an interrupt immediately after the controller has been moved to the
OPERATIONAL state.

This patch delays the first SOF interrupt (it is no longer raised when
the bus is started, only from the next frame boundary on) to let the
driver exit the critical section (which is not protected against
interrupts...)

Some details:

- ohci_irq(): the OHCI interrupt handler acknowledges the SOF IRQ
  only if the state of the driver (rh_state) is OHCI_RH_RUNNING.
  So if this interrupt happens while the driver is not in this state,
  the handler is called again and again, starving the CPU.

- ohci_rh_resume(): the driver re-enables operation with OHCI_USB_OPER.
  In QEMU this starts the SOF timer and QEMU starts to send IRQs. As
  the driver is not yet in OHCI_RH_RUNNING and not protected against
  IRQs, ohci_irq() can be called and the driver never moves to
  OHCI_RH_RUNNING.

Suggested-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
---
 hw/usb/hcd-ohci.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/hw/usb/hcd-ohci.c b/hw/usb/hcd-ohci.c
index 7d65818..5f15ebb 100644
--- a/hw/usb/hcd-ohci.c
+++ b/hw/usb/hcd-ohci.c
@@ -1232,11 +1232,13 @@ static int ohci_service_ed_list(OHCIState *ohci, uint32_t head, int completion)
 }
 
 /* Generate a SOF event, and set a timer for EOF */
-static void ohci_sof(OHCIState *ohci)
+static void ohci_sof(OHCIState *ohci, bool first)
 {
     ohci->sof_time = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
     timer_mod(ohci->eof_timer, ohci->sof_time + usb_frame_time);
-    ohci_set_interrupt(ohci, OHCI_INTR_SF);
+    if (!first) {
+        ohci_set_interrupt(ohci, OHCI_INTR_SF);
+    }
 }
 
 /* Process Control and Bulk lists.  */
@@ -1318,7 +1320,7 @@ static void ohci_frame_boundary(void *opaque)
         ohci->done_count--;
 
     /* Do SOF stuff here */
-    ohci_sof(ohci);
+    ohci_sof(ohci, false);
 
     /* Writeback HCCA */
     if (ohci_put_hcca(ohci, ohci->hcca, &hcca)) {
@@ -1343,7 +1345,7 @@ static int ohci_bus_start(OHCIState *ohci)
 
     trace_usb_ohci_start(ohci->name);
 
-    ohci_sof(ohci);
+    ohci_sof(ohci, true);
 
     return 1;
 }
-- 
1.8.3.1

* [Qemu-devel] [PATCH 2/2] ohci: clear pending SOF on suspend
  2015-12-16 13:39 [Qemu-devel] [PATCH 0/2] ohci: try to mimic real hardware command latency Laurent Vivier
  2015-12-16 13:39 ` [Qemu-devel] [PATCH 1/2] ohci: delay first SOF interrupt Laurent Vivier
@ 2015-12-16 13:39 ` Laurent Vivier
  2015-12-17  9:35   ` Thomas Huth
  1 sibling, 1 reply; 5+ messages in thread
From: Laurent Vivier @ 2015-12-16 13:39 UTC (permalink / raw)
  To: qemu-devel; +Cc: lvivier, Gerd Hoffmann

On an overcommitted CPU, the kernel can be so slow that the device
triggers an interrupt while the driver is not ready to receive it.
This drives us into an infinite loop.

On suspend, if a SOF interrupt is raised between the moment device
processing is stopped and the change of the device's internal state to
OHCI_USB_SUSPEND (QEMU stops the SOF timer on this state change), this
interrupt is never acknowledged.

This patch clears any pending SOF interrupt when OHCI_USB_SUSPEND is set.

Some details:

- ohci_irq(): the OHCI interrupt handler acknowledges the SOF IRQ
  only if the state of the driver (rh_state) is OHCI_RH_RUNNING.
  So if this interrupt happens while the driver is not in this state,
  the handler is called again and again, starving the CPU.

- ohci_rh_suspend(): the function stops the operation and acknowledges
  pending interrupts (but doesn't disable them). Later in the function,
  the device is moved to OHCI_USB_SUSPEND, and the driver to
  OHCI_RH_SUSPENDED. If a new interrupt is raised between the moment
  the interrupts are acknowledged and the moment the device is
  suspended, it will never be acknowledged because the driver is no
  longer in the OHCI_RH_RUNNING state.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
---
 hw/usb/hcd-ohci.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/hw/usb/hcd-ohci.c b/hw/usb/hcd-ohci.c
index 5f15ebb..b5a4e39 100644
--- a/hw/usb/hcd-ohci.c
+++ b/hw/usb/hcd-ohci.c
@@ -1438,6 +1438,9 @@ static void ohci_set_ctl(OHCIState *ohci, uint32_t val)
         break;
     case OHCI_USB_SUSPEND:
         ohci_bus_stop(ohci);
+        /* clear pending SF otherwise driver loops in ohci_irq() */
+        ohci->intr_status &= ~OHCI_INTR_SF;
+        ohci_intr_update(ohci);
         break;
     case OHCI_USB_RESUME:
         trace_usb_ohci_resume(ohci->name);
-- 
1.8.3.1

* Re: [Qemu-devel] [PATCH 1/2] ohci: delay first SOF interrupt
  2015-12-16 13:39 ` [Qemu-devel] [PATCH 1/2] ohci: delay first SOF interrupt Laurent Vivier
@ 2015-12-17  9:25   ` Thomas Huth
  0 siblings, 0 replies; 5+ messages in thread
From: Thomas Huth @ 2015-12-17  9:25 UTC (permalink / raw)
  To: Laurent Vivier, qemu-devel; +Cc: Gerd Hoffmann

On 16/12/15 14:39, Laurent Vivier wrote:
> On overcommitted CPU, kernel can be so slow that an interrupt can
> be triggered by the device whereas the driver is not ready to receive
> it. This drives us into an infinite loop.
> 
> This does not happen on real hardware because real hardware never send
> interrupt immediately after the controller has been moved to OPERATION state.
> 
> This patch tries to delay the first SOF interrupt to let driver exits from
> the critical section (which is not protected against interrupts...)
> 
> Some details:
> 
> - ohci_irq(): the OHCI interrupt handler, acknowledges the SOF IRQ
>   only if the state of the driver (rh_state) is OHCI_STATE_RUNNING.
>   So if this interrupt happens and the driver is not in this state,
>   the function is called again and again, moving the system to a
>   CPU starvation.
> 
> - ohci_rh_resume(): the driver re-enables operation with OHCI_USB_OPER.
>   In QEMU this start the SOF timer and QEMU starts to send IRQs. As
>   the driver is not in OHCI_STATE_RUNNING and not protected against IRQ,
>   the ohci_irq() can be called and the driver never moved to
>   OHCI_STATE_RUNNING.
> 
> Suggested-by: Gerd Hoffmann <kraxel@redhat.com>
> Signed-off-by: Laurent Vivier <lvivier@redhat.com>
> ---
>  hw/usb/hcd-ohci.c | 10 ++++++----
>  1 file changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/usb/hcd-ohci.c b/hw/usb/hcd-ohci.c
> index 7d65818..5f15ebb 100644
> --- a/hw/usb/hcd-ohci.c
> +++ b/hw/usb/hcd-ohci.c
> @@ -1232,11 +1232,13 @@ static int ohci_service_ed_list(OHCIState *ohci, uint32_t head, int completion)
>  }
>  
>  /* Generate a SOF event, and set a timer for EOF */

May I suggest reflecting the new behavior (with the explanation) in the
comment above? Otherwise this might be hard to understand in a couple
of years if you only read the source code and not the changelog.

> -static void ohci_sof(OHCIState *ohci)
> +static void ohci_sof(OHCIState *ohci, bool first)
>  {
>      ohci->sof_time = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
>      timer_mod(ohci->eof_timer, ohci->sof_time + usb_frame_time);
> -    ohci_set_interrupt(ohci, OHCI_INTR_SF);
> +    if (!first) {
> +        ohci_set_interrupt(ohci, OHCI_INTR_SF);
> +    }
>  }
>  
>  /* Process Control and Bulk lists.  */
> @@ -1318,7 +1320,7 @@ static void ohci_frame_boundary(void *opaque)
>          ohci->done_count--;
>  
>      /* Do SOF stuff here */
> -    ohci_sof(ohci);
> +    ohci_sof(ohci, false);
>  
>      /* Writeback HCCA */
>      if (ohci_put_hcca(ohci, ohci->hcca, &hcca)) {
> @@ -1343,7 +1345,7 @@ static int ohci_bus_start(OHCIState *ohci)
>  
>      trace_usb_ohci_start(ohci->name);
>  
> -    ohci_sof(ohci);
> +    ohci_sof(ohci, true);
>  
>      return 1;
>  }

<bikeshedpainting>
Alternate idea: Split ohci_sof into two functions, e.g. like this:

static void ohci_sof_timer(OHCIState *ohci)
{
    ohci->sof_time = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
    timer_mod(ohci->eof_timer, ohci->sof_time + usb_frame_time);
}

static void ohci_sof(OHCIState *ohci)
{
    ohci_sof_timer(ohci);
    ohci_set_interrupt(ohci, OHCI_INTR_SF);
}

... and then only call ohci_sof_timer() in ohci_bus_start(). I think
that would be a little bit easier to read than the stuff with the
"first" parameter.
</bikeshedpainting>

Anyway, the patch looks basically like a good idea to me, so I'm also
fine with the original form if you don't want to change it.

 Thomas

* Re: [Qemu-devel] [PATCH 2/2] ohci: clear pending SOF on suspend
  2015-12-16 13:39 ` [Qemu-devel] [PATCH 2/2] ohci: clear pending SOF on suspend Laurent Vivier
@ 2015-12-17  9:35   ` Thomas Huth
  0 siblings, 0 replies; 5+ messages in thread
From: Thomas Huth @ 2015-12-17  9:35 UTC (permalink / raw)
  To: Laurent Vivier, qemu-devel; +Cc: Gerd Hoffmann

On 16/12/15 14:39, Laurent Vivier wrote:
> On overcommitted CPU, kernel can be so slow that an interrupt can
> be triggered by the device whereas the driver is not ready to receive
> it. This drives us into an infinite loop.
> 
> On suspend, if a SOF interrupt is raised between the stop of the
> device processing and the change of the device internal state to
> OHCI_USB_SUSPEND (QEMU stops SOF timer on this state change), this
> interrupt is never acknowledged.
> 
> This patch clears pending SOF interrupt on OHCI_USB_SUSPEND setting.
> 
> Some details:
> 
> - ohci_irq(): the OHCI interrupt handler, acknowledges the SOF IRQ
>   only if the state of the driver (rh_state) is OHCI_STATE_RUNNING.
>   So if this interrupt happens and the driver is not in this state,
>   the function is called again and again, moving the system to a
>   CPU starvation.
> 
> - ohci_rh_suspend(): the function stop the operation and acknowledge
>   pending interrupts (but doesn't disable it). Later in the function,
>   the device is moved to OHCI_SUSPEND_STATE, and the driver to
>   OHCI_RH_SUSPENDED. If between the moment when the interrupt is
>   acknowledged and the moment when the device is suspended a new
>   interrupt is raised, it will be never acknowledged because the
>   driver is now not in OHCI_RH_RUNNING state.
> 
> Signed-off-by: Laurent Vivier <lvivier@redhat.com>
> ---
>  hw/usb/hcd-ohci.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/hw/usb/hcd-ohci.c b/hw/usb/hcd-ohci.c
> index 5f15ebb..b5a4e39 100644
> --- a/hw/usb/hcd-ohci.c
> +++ b/hw/usb/hcd-ohci.c
> @@ -1438,6 +1438,9 @@ static void ohci_set_ctl(OHCIState *ohci, uint32_t val)
>          break;
>      case OHCI_USB_SUSPEND:
>          ohci_bus_stop(ohci);
> +        /* clear pending SF otherwise driver loops in ohci_irq() */

May I suggest saying "Linux driver" instead of only "driver" here?
QEMU also supports other guests, so the context might not be clear
otherwise.

> +        ohci->intr_status &= ~OHCI_INTR_SF;
> +        ohci_intr_update(ohci);
>          break;
>      case OHCI_USB_RESUME:
>          trace_usb_ohci_resume(ohci->name);

Apart from that nit in the comment, patch looks sane to me.

Reviewed-by: Thomas Huth <thuth@redhat.com>

Thread overview: 5+ messages
2015-12-16 13:39 [Qemu-devel] [PATCH 0/2] ohci: try to mimic real hardware command latency Laurent Vivier
2015-12-16 13:39 ` [Qemu-devel] [PATCH 1/2] ohci: delay first SOF interrupt Laurent Vivier
2015-12-17  9:25   ` Thomas Huth
2015-12-16 13:39 ` [Qemu-devel] [PATCH 2/2] ohci: clear pending SOF on suspend Laurent Vivier
2015-12-17  9:35   ` Thomas Huth
