From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1752806AbcHZJWF (ORCPT <rfc822;w@1wt.eu>);
        Fri, 26 Aug 2016 05:22:05 -0400
Received: from mail-pa0-f66.google.com ([209.85.220.66]:36684 "EHLO
        mail-pa0-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1752541AbcHZJWD (ORCPT
        <rfc822;linux-kernel@vger.kernel.org>);
        Fri, 26 Aug 2016 05:22:03 -0400
Date: Sat, 27 Aug 2016 01:21:52 +0800
From: Peter Chen <hzpeterchen@gmail.com>
To: Clemens Gruber <clemens.gruber@pqgruber.com>
Cc: linux-usb@vger.kernel.org,
        Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
        linux-kernel@vger.kernel.org
Subject: Re: chipidea: udc: kernel panic in isr_setup_status_phase
Message-ID: <20160826172151.GA21360@b29397-desktop>
References: <20160823003630.GA3052@archie.localdomain>
 <20160824081102.GA27233@shlinux2>
 <20160825234740.GA12850@archie.localdomain>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20160825234740.GA12850@archie.localdomain>
User-Agent: Mutt/1.5.24 (2015-08-30)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Aug 26, 2016 at 01:47:40AM +0200, Clemens Gruber wrote:
> On Wed, Aug 24, 2016 at 04:11:02PM +0800, Peter Chen wrote:
> > UEI is an error interrupt that the software does not handle, so it will
> > not affect ci->status.
> > 
> > > Should we only call isr_tr_complete_handler if UI && !UEI ?
> > > 
> > > Or would adding a check for ci->status == NULL in isr_setup_status_phase
> > > and returning an error code also be a good idea?
> > 
> > I agree with that.
> 
> OK, I now return -EINVAL if (ci->status == NULL). This does fix the
> kernel panic, but the usb0 interface stays down and does not work.
> Should I send a patch to avoid the NULL pointer dereference now, or after
> we have found the cause of ci->status being NULL in the first place?

You can send the patch after we find the root cause.
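
For reference, the guard being discussed could look roughly like the
sketch below. It is a standalone illustration, not the driver code:
struct ci_stub, struct usb_request_stub and setup_status_phase_guarded()
are hypothetical stand-ins for the real chipidea types and for
isr_setup_status_phase(), and the only behavior taken from this thread
is "return -EINVAL when ci->status is NULL".

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins; NOT the real chipidea definitions. */
struct usb_request_stub {
	int length;
};

struct ci_stub {
	struct usb_request_stub *status; /* status-phase request, may be NULL */
};

/*
 * Refuse to run the status phase when no status request is allocated,
 * instead of dereferencing a NULL pointer.
 */
static int setup_status_phase_guarded(struct ci_stub *ci)
{
	if (ci == NULL || ci->status == NULL)
		return -EINVAL;

	/* Placeholder for the real work (the driver would queue the request). */
	ci->status->length = 0;
	return 0;
}
```

As noted above, such a check only avoids the panic; it does not explain
why ci->status is NULL.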

> 
> > 
> > > 
> > > Do you have an idea what's going on there and why ci->status is NULL?
> > > 
> > 
> > I can't explain it; the only possibility is that the last disconnect event
> > (see ci_udc_vbus_session->_gadget_stop_activity) was scheduled very late
> > because vbus lowers very slowly.
> 
> I now have more information about the two different behaviors.
> I added some printk statements..
> 
> A) When it does not work:
> ci_udc_vbus_session: is_active, gadget_ready=1
> ci_udc_pullup: is_on=1
> udc_irq: USBi_UI

The gadget triggers the UI interrupt because the host sent a packet.

I really can't understand why the host does not send a bus reset
before sending a packet (e.g., GET_DESCRIPTOR). That violates the USB spec.

Are you sure the first interrupt is UI when vbus goes from off to on?
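
One way to check this from a recorded list of interrupt status values
is sketched below. It is only an illustration under assumptions: the
TRACE_* names and bit values are made up for this example (take the
real USBi_* definitions from the driver headers), and
reset_seen_before_first_ui() is a hypothetical helper, not an existing
function.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flags for this sketch only; not the driver's definitions. */
enum {
	TRACE_UI  = 1u << 0, /* USB interrupt (setup/transfer complete) */
	TRACE_PCI = 1u << 2, /* port change */
	TRACE_URI = 1u << 6, /* bus reset */
	TRACE_SLI = 1u << 8, /* suspend */
};

/*
 * Returns true if a bus reset (URI) was recorded no later than the
 * first UI in the trace, or if the trace contains no UI at all.
 */
static bool reset_seen_before_first_ui(const unsigned int *irqs, size_t n)
{
	bool reset_seen = false;

	for (size_t i = 0; i < n; i++) {
		if (irqs[i] & TRACE_URI)
			reset_seen = true;
		if (irqs[i] & TRACE_UI)
			return reset_seen;
	}
	return true;
}
```

Fed with the sequence from your case A (UI first) this returns false,
and with the sequence from case B (SLI, URI, PCI, UI) it returns true.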

Peter

> isr_tr_complete_handler: when calling isr_setup_status_phase at i=8
>  isr_setup_status_phase: ci: status is NULL, vbus_active=1, ep0_dir=TX
> udc_irq: USBi_UI
> isr_setup_packet_handler: USB_REQ_SET_ADDRESS, type=0, ci->status=NULL
>  isr_setup_status_phase: ci: status is NULL, vbus_active=1, ep0_dir=RX
> (This then repeats a few times, beginning from udc_irq)
> 
> B) When it works:
> ci_udc_vbus_session: is_active=1 gadget_ready=1
> ci_udc_pullup: is_on=1
> udc_irq: USBi_SLI
> _gadget_stop_activity
> udc_irq: USBi_URI
> udc_irq: USBi_PCI
> udc_irq: USBi_UI
> udc_irq: USBi_UI
> _gadget_stop_activity
> usb_ep_free_request
> udc_irq: USBi_UI | USBi_URI
> udc_irq: USBi_PCI
> isr_setup_packet_handler: USB_REQ_SET_ADDRESS, ci->status is not NULL
> udc_irq: USBi_UI
> (The above repeats a few times from _gadget_stop_activity to USBi_UI)
> (Then USBi_UI occurs many times)
> configsfs-gadget gadget: high-speed config #1 ..
> (More USBi_UI interrupts)
> IPv6: ADDRCONF (NETDEV_CHANGE): usb0: link becomes ready
> 
> --
> 
> So, both cases are very different, and avoiding that NULL pointer
> dereference only fixed the kernel panic, not the problem with the
> USB gadget failing to initialize correctly after plugging in.
> 
> In A), the USBi_UI interrupts shouldn't arrive that early, I suppose. If
> they are the reason why the problem occurred, the question is: what
> triggered them?
> 
> Does the printk output give you more insight into the problem?
> 
> --
> 
> You mentioned the possibility that vbus lowers too slowly, but vbus is
> supplied externally by the host, and the problem occurs not only when
> the cable is unplugged and plugged in again, but also at boot, when
> there were no previous disconnect events.
> Or did you mean something else with "vbus lowers too slow"?
> 
> Do you have any suggestions how to approach this problem further?
> Other spots where adding a printk would be helpful to find out what's
> causing this?
> 
> Regards,
> Clemens

-- 

Best Regards,
Peter Chen