From: Guenter Roeck <linux@roeck-us.net>
To: Gerd Hoffmann <kraxel@redhat.com>
Cc: "Paolo Bonzini" <pbonzini@redhat.com>,
	"Philippe Mathieu-Daudé" <philmd@redhat.com>,
	qemu-devel@nongnu.org,
	"Marc-André Lureau" <marcandre.lureau@redhat.com>
Subject: Re: [Qemu-devel] [PATCH] ehci: Ensure that device is not NULL before calling usb_ep_get
Date: Tue, 20 Aug 2019 08:38:35 -0700
Message-ID: <ffd106b7-2310-ac52-bc33-d03c6a387c39@roeck-us.net>
In-Reply-To: <20190813114203.z62dgyyneqcp3mru@sirius.home.kraxel.org>

On 8/13/19 4:42 AM, Gerd Hoffmann wrote:
> On Tue, Aug 06, 2019 at 06:23:38AM -0700, Guenter Roeck wrote:
>> On 8/2/19 7:11 AM, Gerd Hoffmann wrote:
>>> On Wed, Jul 31, 2019 at 01:08:50PM +0200, Philippe Mathieu-Daudé wrote:
>>>> On 7/30/19 7:45 PM, Guenter Roeck wrote:
>>>>> The following assert is seen occasionally while resetting the
>>>>> Linux kernel.
>>>>>
>>>>> qemu-system-x86_64: hw/usb/core.c:734: usb_ep_get:
>>>>> 	Assertion `dev != NULL' failed.
>>>>>
>>>>> The call to usb_ep_get() originates from ehci_execute().
>>>>> Analysis and debugging show that p->queue->dev can indeed be NULL
>>>>> in this function. Add a check for this condition and return an error
>>>>> if it is seen.
>>>>
>>>> Your patch is not wrong as it corrects your case, but I wonder why we
>>>> get there. This assert seems to have caught a bug.
>>>
>>> https://bugzilla.redhat.com//show_bug.cgi?id=1715801 maybe.
>>>
>>>> Gerd, shouldn't we call usb_packet_cleanup() in ehci_reset() rather than
>>>> ehci_finalize()? Then we shouldn't need this patch.
>>>
>>> The two ehci_queues_rip_all() calls in ehci_reset() should clean up everything
>>> properly.
>>>
>>> Can you try the patch below to see whether an ehci_find_device failure is the
>>> root cause?
>>>
>>> thanks,
>>>     Gerd
>>>
>>> diff --git a/hw/usb/hcd-ehci.c b/hw/usb/hcd-ehci.c
>>> index 62dab0592fa2..2b0a57772ed5 100644
>>> --- a/hw/usb/hcd-ehci.c
>>> +++ b/hw/usb/hcd-ehci.c
>>> @@ -1644,6 +1644,10 @@ static EHCIQueue *ehci_state_fetchqh(EHCIState *ehci, int async)
>>>            q->dev = ehci_find_device(q->ehci,
>>>                                      get_field(q->qh.epchar, QH_EPCHAR_DEVADDR));
>>>        }
>>> +    if (q->dev == NULL) {
>>> +        fprintf(stderr, "%s: device %d not found\n", __func__,
>>> +                get_field(q->qh.epchar, QH_EPCHAR_DEVADDR));
>>> +    }
>> Turns out I end up seeing that message hundreds of times early on each boot,
>> no matter which architecture. It is quite obviously a normal operating condition.
> 
> Yep, as long as the queue is not active this is completely harmless.
> So we need to check a bit later.  In execute() looks a bit too late
> though, we don't have a good backup plan then.
> 
> Does the patch below solve the problem without bad side effects?
> 
> thanks,
>    Gerd
> 
> From 5980eaad23f675a2d509d0c55e288793619761bc Mon Sep 17 00:00:00 2001
> From: Gerd Hoffmann <kraxel@redhat.com>
> Date: Tue, 13 Aug 2019 13:37:09 +0200
> Subject: [PATCH] ehci: try fix queue->dev null ptr dereference
> 
> Reported-by: Guenter Roeck <linux@roeck-us.net>
> Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
> ---
>   hw/usb/hcd-ehci.c | 3 +++
>   1 file changed, 3 insertions(+)
> 
> diff --git a/hw/usb/hcd-ehci.c b/hw/usb/hcd-ehci.c
> index 62dab0592fa2..5f089f30054b 100644
> --- a/hw/usb/hcd-ehci.c
> +++ b/hw/usb/hcd-ehci.c
> @@ -1834,6 +1834,9 @@ static int ehci_state_fetchqtd(EHCIQueue *q)
>               ehci_set_state(q->ehci, q->async, EST_EXECUTING);
>               break;
>           }
> +    } else if (q->dev == NULL) {
> +        ehci_trace_guest_bug(q->ehci, "no device attached to queue");
> +        ehci_set_state(q->ehci, q->async, EST_HORIZONTALQH);
>       } else {
>           p = ehci_alloc_packet(q);
>           p->qtdaddr = q->qtdaddr;
> 

That seems to be working as intended. I have not seen a crash
since I applied it. I tested it on top of v4.0 and v4.1.

Tested-by: Guenter Roeck <linux@roeck-us.net>

Guenter
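
For readers following the thread, here is a minimal, self-contained sketch
of the idea behind the tested patch: a queue with no device is harmless
while its qTD is inactive, and only becomes a problem when the controller
is about to execute a transfer. Everything named Sketch* or sketch_* below
is a hypothetical stand-in invented for illustration. It is not QEMU's
code, and the ordering of the checks is a simplified assumption about how
ehci_state_fetchqtd() behaves.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not QEMU's real definitions. */
typedef struct SketchDevice {
    int addr;
} SketchDevice;

typedef enum SketchState {
    SKETCH_HORIZONTALQH,  /* skip this queue, move on to the next one */
    SKETCH_EXECUTING      /* hand the transfer to the device */
} SketchState;

typedef struct SketchQueue {
    SketchDevice *dev;    /* NULL when no device matched the queue's address */
    bool qtd_active;      /* true when the current qTD is marked active */
    int guest_bugs;       /* stand-in for a guest-bug trace: just a counter */
} SketchQueue;

/* Mirrors the precondition asserted in usb_ep_get(): dev must not be NULL. */
static int sketch_ep_get(const SketchDevice *dev)
{
    assert(dev != NULL);
    return dev->addr;
}

/*
 * Decide what to do with a queue. The NULL check sits in front of the only
 * path that would reach sketch_ep_get(), so the assertion cannot fire.
 */
static SketchState sketch_fetchqtd(SketchQueue *q)
{
    if (!q->qtd_active) {
        /* Inactive queue: a missing device is normal, nothing to report. */
        return SKETCH_HORIZONTALQH;
    }
    if (q->dev == NULL) {
        /* Active queue without a device: record it and skip the queue. */
        q->guest_bugs++;
        return SKETCH_HORIZONTALQH;
    }
    (void)sketch_ep_get(q->dev);
    return SKETCH_EXECUTING;
}
```

The point of checking here rather than at execution time is that the state
machine can still fall back to the next queue head, which matches Gerd's
remark that execute() is too late for a good backup plan.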



