From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1752806AbcHZJWF (ORCPT );
        Fri, 26 Aug 2016 05:22:05 -0400
Received: from mail-pa0-f66.google.com ([209.85.220.66]:36684 "EHLO
        mail-pa0-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1752541AbcHZJWD (ORCPT );
        Fri, 26 Aug 2016 05:22:03 -0400
Date: Sat, 27 Aug 2016 01:21:52 +0800
From: Peter Chen
To: Clemens Gruber
Cc: linux-usb@vger.kernel.org, Greg Kroah-Hartman,
        linux-kernel@vger.kernel.org
Subject: Re: chipidea: udc: kernel panic in isr_setup_status_phase
Message-ID: <20160826172151.GA21360@b29397-desktop>
References: <20160823003630.GA3052@archie.localdomain>
        <20160824081102.GA27233@shlinux2>
        <20160825234740.GA12850@archie.localdomain>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20160825234740.GA12850@archie.localdomain>
User-Agent: Mutt/1.5.24 (2015-08-30)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Aug 26, 2016 at 01:47:40AM +0200, Clemens Gruber wrote:
> On Wed, Aug 24, 2016 at 04:11:02PM +0800, Peter Chen wrote:
> > UEI is an error interrupt, and software has not handled it, so it will
> > not affect ci->status.
> >
> > > Should we only call isr_tr_complete_handler if UI && !UEI ?
> > >
> > > Or would adding a check for ci->status == NULL in isr_setup_status_phase
> > > and returning an error code also be a good idea?
> >
> > I agree with that.
>
> OK I now return -EINVAL if (ci->status == NULL). This does fix the
> kernel panic, but the usb0 interface stays down and does not work.
> Should I send a patch to avoid the NULL pointer dereference now or after
> we found the cause of ci->status being NULL in the first place?

You can send the patch after we find the root cause.

>
> >
> > >
> > > Do you have an idea what's going on there and why ci->status is NULL?
> > >
> >
> > I can't understand it, the only possibility is that the last disconnect
> > event (see ci_udc_vbus_session->_gadget_stop_activity) was scheduled very
> > late because vbus lowers very slowly.
>
> I now have more information about the two different behaviors.
> I added some printk statements..
>
> A) When it does not work:
> ci_udc_vbus_session: is_active, gadget_ready=1
> ci_udc_pullup: is_on=1
> udc_irq: USBi_UI

The gadget triggers the UI interrupt because the host sent a packet. I
really can't understand why the host does not send a bus reset before
sending a packet (e.g. GET_DESCRIPTOR); that violates the USB spec.

Are you sure the first interrupt is UI when vbus goes from off to on?

Peter

> isr_tr_complete_handler: when calling isr_setup_status_phase at i=8
> isr_setup_status_phase: ci: status is NULL, vbus_active=1, ep0_dir=TX
> udc_irq: USBi_UI
> isr_setup_packet_handler: USB_REQ_SET_ADDRESS, type=0, ci->status=NULL
> isr_setup_status_phase: ci: status is NULL, vbus_active=1, ep0_dir=RX
> (This then repeats a few times, beginning from udc_irq)
>
> B) When it works:
> ci_udc_vbus_session: is_active=1 gadget_ready=1
> ci_udc_pullup: is_on=1
> udc_irq: USBi_SLI
> _gadget_stop_activity
> udc_irq: USBi_URI
> udc_irq: USBi_PCI
> udc_irq: USBi_UI
> udc_irq: USBi_UI
> _gadget_stop_activity
> usb_ep_free_request
> udc_irq: USBi_UI | USBi_URI
> udc_irq: USBi_PCI
> isr_setup_packet_handler: USB_REQ_SET_ADDRESS, ci->status is not NULL
> udc_irq: USBi_UI
> (The above repeats a few times from _gadget_stop_activity to USBi_UI)
> (Then USBi_UI occurs many times)
> configsfs-gadget gadget: high-speed config #1 ..
> (More USBi_UI interrupts)
> IPv6: ADDRCONF (NETDEV_CHANGE): usb0: link becomes ready
>
> --
>
> So, both cases are very different and avoiding that NULL pointer
> dereference did only fix the kernel panic but not the problem with the
> USB gadget not initializing correctly after plugging in.
>
> In A) The USBi_UI interrupts shouldn't arrive that early, I suppose.
> If they are the reason why the problem occurred, the question is, what
> triggered them?
>
> Does the printk output give you more insight into the problem?
>
> --
>
> You mentioned the possibility that vbus lowers too slow, but vbus is
> supplied externally by the host and the problem not only occurs when
> the cable is plugged out and in again. Also at boot up when there were
> no previous disconnect events.
> Or did you mean something else with "vbus lowers too slow"?
>
> Do you have any suggestions how to approach this problem further?
> Other spots where adding a printk would be helpful to find out what's
> causing this?
>
> Regards,
> Clemens

--

Best Regards,
Peter Chen
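
For reference, the guard discussed in this thread (returning -EINVAL when
ci->status is NULL instead of dereferencing it) can be sketched in plain
userspace C. This is only an illustration under stated assumptions:
struct mock_ci, struct mock_request and mock_setup_status_phase() are
hypothetical stand-ins and are not the chipidea driver's actual types or
code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a USB request; not the kernel's usb_request. */
struct mock_request {
    int length;
};

/* Hypothetical stand-in for the controller state; not struct ci_hdrc. */
struct mock_ci {
    struct mock_request *status; /* models ci->status */
    int vbus_active;
};

/*
 * Models the guard from the thread: return -EINVAL rather than
 * dereferencing a NULL status request.
 */
static int mock_setup_status_phase(struct mock_ci *ci)
{
    assert(ci != NULL); /* a NULL controller would be a caller bug */

    if (ci->status == NULL)
        return -EINVAL;

    /* The status stage of a control transfer carries no data. */
    ci->status->length = 0;
    return 0;
}
```

As reported above, a guard like this only avoids the panic. It does not
explain why ci->status is NULL, and the gadget still fails to enumerate in
case A.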