Hi,

Lianwei Wang writes:
>> Lianwei Wang writes:
>> > On Mon, Jun 17, 2019 at 5:40 AM Felipe Balbi wrote:
>> >>
>> >> Lianwei Wang writes:
>> >>
>> >> > The udc and gadget device will be deleted when udc device is
>> >> > disconnected and the related function will be unbind with it.
>> >> >
>> >> > But if the configfs is not deleted, then the function object
>> >> > will be kept and the bound status is kept true.
>> >> >
>> >> > Then after udc device is connected again and a new udc and
>> >> > gadget objects will be created and passed to bind interface.
>> >> > But because the bound is still true, the new gadget is not
>> >> > updated to netdev and a previous freed gadget will be used
>> >> > in netdev after bind.
>> >> >
>> >> > To fix this using after freed issue, always set the gadget
>> >> > object to netdev in bind interface.
>> >> >
>> >> > Signed-off-by: Lianwei Wang
>> >>
>> >> I can't actually understand what's the problem here. The gadget is not
>> >> deleted when we disconnect the cable.
>> >>
>> >> --
>> >> balbi
>> >
>> > The issue was observed with a dual-role capable USB controller (e.g. Intel
>> > XHCI controller), which has the ability to switch role between host and device
>> > mode. The gadget is deleted when we switch role to device mode from host
>> > mode. See below log:
>> > # echo p > /sys/devices/pci0000:00/0000:00:15.1/intel-cht-otg.0/mux_state #(4.4)
>>
>> oh, so you're using a modified tree :-) Then we can't really help.
>>
>> > [ 41.170891] intel-cht-otg intel-cht-otg.0: p: set PERIPHERAL mode
>> > [ 41.171895] dwc3 dwc3.0.auto: DWC3 OTG Notify USB_EVENT_VBUS
>> > [ 41.187420] dwc3 dwc3.0.auto: dwc3_resume_common
>> > [ 41.191192] usb 1-1: USB disconnect, device number 3
>> > [ 41.191284] usb 1-1.1: USB disconnect, device number 4
>> > [ 41.218958] usb 1-1.5: USB disconnect, device number 5
>> > [ 41.238117] android_work: sent uevent USB_STATE=CONFIGURED
>> > [ 41.240572] android_work: sent uevent USB_STATE=DISCONNECTED
>>
>> What is this android_work. That doesn't exist upstream.
>>
>> > [ 41.263285] platform dabr_udc.0: unregister gadget driver 'configfs-gadget'
>> > [ 41.263413] configfs-gadget gadget: unbind function 'Function FS Gadget'/ffff8801db049e38
>> > [ 41.263969] configfs-gadget gadget: unbind function 'cdc_network'/ffff8801d8897400
>> > [ 41.325943] dabridge 1-1.5:1.0: Port 3 VBUS OFF
>> > [ 41.720957] dabr_udc deleted
>> > [ 41.721097] dabridge 1-5 deleted
>> >
>> > The UDC and gadget will be deleted after switch role to device mode.
>> > And they will be created as new object when switching back to host
>> > mode. At this time the bind in function driver (e.g. f_ncm) will not
>> > set the new gadget.
>> >
>> > For kernel 4.19+, the role switch command will be:
>> > echo "device" > /sys/class/usb_role/intel_xhci_usb_sw-role-switch/role
>> >
>> > The latest Intel role switch kernel driver can be found here:
>> > https://elixir.bootlin.com/linux/v5.2-rc5/source/drivers/usb/roles/intel-xhci-usb-role-switch.c
>>
>> Right, please test against v5.2-rc5 and show me the problem on that
>> kernel. I can't apply patches for problems that may not even exist in
>> upstream, sorry.
>>
>> --
>> balbi
>
> The issue exist in main line kernel, but I can not test it with
> v5.2-rc5 kernel. I tested it with 4.19 kernel,

which of the v4.19?

> which, for the usb gadget part, has almost the same code as v5.2.
> It is 100% reproducible with dual role USB controller or by removing
> UDC hardware. Take f_ncm for example, the use case as follows:

Keep in mind that the way android handles dual-role is completely
different from what we have upstream.

> 1. USB controller is in host mode, f_ncm and UDC is configured in configfs.
>    - The ncm is instanced and alloced when "functions/ncm.usb0" is
>      created and it will be freed when those files are deleted from
>      configfs.
>
> 2. enable the gadget and bind it to this ncm function.
>    - For the first time running, ncm_opts->bound is 0 and
>      gether_set_gadget is called to set the gadget. The bound is set
>      to 1 then.
>
> 3. If the UDC is disconnected from bus, then the UDC and its gadget is

what do you mean by "disconnected from the bus"? Removing the cable (aka,
disconnect) will only cause the ->disconnect() callback to be called. It
will not result in the UDC being freed. Is this, perhaps, something
specific to android?

>    deleted. But because the ncm.usb0 is still there, ncm object is not
>    freed and ncm_opts->bound is still set.
>    There are two ways to disconnect the UDC hardware. One is for dual
>    role host controller by switch host controller role to device mode.
>    Another way is by removing the UDC hardware from bus, both will
>    generate an usb device disconnect event to UDC driver to delete udc
>    and gadget.

not true, unless I misunderstand what you mean. Disconnect will generate
a disconnect interrupt in most UDCs (except for dummy) and the
->disconnect() callback will be called. Nothing will be freed.

> 4. Now the bound of ncm is still set and gadget is deleted due to udc
>    disconnected. And if we connect the udc device again, then it will
>    create new udc and gadget and bind to ncm again. But because bound
>    is already set, the new gadget is not set to gether
>    (gether_register_netdev not called).
>
> Not sure if this is clear to you. Please review the scenario and the patch.

This sounds a little like it's android-specific. Is your platform using
dwc3? Can you capture tracepoints of the failure? ftrace_dump_on_oops
will help getting the actual tracepoints in this case.

cheers

--
balbi
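
For readers following the thread, the scenario in steps 1-4 can be modelled in user space. This is only a sketch of the behaviour as the reporter describes it, not the f_ncm code itself: `struct gadget`, `struct netdev_model`, `struct ncm_opts_model`, `bind_guarded()`, `bind_always()`, `unbind()` and `rebind_sees_new_gadget()` are hypothetical names made up for illustration, and the assumption that unbind leaves the `bound` flag set comes from the report, not from a verified upstream trace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel objects; not the real definitions. */
struct gadget {
    int id;
};

struct netdev_model {
    const struct gadget *gadget;   /* what gether_set_gadget() would record */
};

struct ncm_opts_model {
    struct netdev_model net;
    bool bound;                    /* lives as long as functions/ncm.usb0 exists */
};

/* Bind as described in the report: the gadget is stored only on first bind. */
static void bind_guarded(struct ncm_opts_model *opts, const struct gadget *g)
{
    if (!opts->bound) {
        opts->net.gadget = g;
        opts->bound = true;
    }
}

/* Bind as the patch proposes: refresh the gadget pointer on every bind. */
static void bind_always(struct ncm_opts_model *opts, const struct gadget *g)
{
    opts->net.gadget = g;
    opts->bound = true;
}

/* Assumption from the report: unbind clears neither the flag nor the pointer. */
static void unbind(struct ncm_opts_model *opts)
{
    (void)opts;
}

typedef void (*bind_fn)(struct ncm_opts_model *, const struct gadget *);

/* Returns true if the netdev model points at the current gadget after a rebind. */
static bool rebind_sees_new_gadget(bind_fn bind)
{
    struct ncm_opts_model opts = { .net = { NULL }, .bound = false };
    struct gadget first = { 1 };
    struct gadget second = { 2 };

    assert(bind != NULL);

    bind(&opts, &first);    /* step 2: first bind stores the gadget */
    unbind(&opts);          /* step 3: in the report, 'first' is freed here */
    bind(&opts, &second);   /* step 4: a new gadget is passed to bind */

    return opts.net.gadget == &second;
}
```

Calling `rebind_sees_new_gadget(bind_guarded)` leaves the model pointing at the first gadget, which is the stale pointer the patch description worries about; `rebind_sees_new_gadget(bind_always)` picks up the second one. Whether the first gadget is really freed on an upstream kernel is exactly the open question in this thread, and the model does not settle it.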