> On 24.07.2019 at 16:20, Greg KH wrote:
>
> On Wed, Jul 24, 2019 at 03:27:51PM +0200, Maik Stohn wrote:
>> KERNEL CRASH when using XHCI devices (affects any architecture, any USB device)
>>
>> This was already reported as a kernel bug in bugzilla (https://bugzilla.kernel.org/show_bug.cgi?id=204257) but I was told to report it here since it is USB related...
>>
>> Affected kernels: 5.2, 5.2.1, 5.2.2, 5.3-rc1, ...
>>
>> This bug is already causing real-world problems with existing software and devices using SCSI BOT with raw SCSI commands and libusb software.
>>
>> Reproduce (tested on several different machines with 5.2, 5.2.1, 5.2.2, 5.3-rc1):
>>
>> - USB flash drive attached to XHCI controller (e.g. USB3.0 flash drive attached to USB3.0 port)
>> - generic SCSI module loaded (e.g. /dev/sg0 comes up when attaching the flash drive)
>> - command line tool "sg_raw" from "sg3-utils"
>> - execute the following and press a key + return (-s1 sends one byte which is read from stdin)
>>   $ sudo sg_raw -s1 /dev/sg0 00 00 00 00 00 00 00 00 00 00
>>
>> -> KERNEL Oops
>>
>> - same for -s2, -s3, ... up to -s8 (sending 1 to 8 bytes, exactly the maximum number of bytes on my 64 bit machine where the "DMA bypass optimization / IDT" kicks in, see below)
>>
>> Since this can be triggered by any normal user (without any special USB device needed) I think it is important enough to fix it for the existing 5.2 kernel as well.
>>
>> ---
>>
>> Patch introducing the crash: https://patchwork.kernel.org/patch/10919167 / commit 33e39350ebd20fe6a77a51b8c21c3aa6b4a208cf - "usb: xhci: add Immediate Data Transfer support"
>>
>> Reason: NULL pointer dereference
>>
>> ---
>>
>> It took me quite some time to find the cause of this.
>>
>> I narrowed the crash down to the place inside of "xhci_queue_bulk_tx" in "xhci-ring.c" where the next SG is loaded
>>
>> ...
>>     while (sg && sent_len >= block_len) {
>>             /* New sg entry */
>>             --num_sgs;
>>             sent_len -= block_len;
>>             if (num_sgs != 0) {
>>                     sg = sg_next(sg);
>>                     block_len = sg_dma_len(sg);     <================= CRASH
>> The comment of "sg_dma_len" clearly states "These macros should be used after a dma_map_sg call has been done..." - which was
>> omitted by the new "xhci_map_urb_for_dma" function since the transfer was considered suitable for IDT.
>>                     addr = (u64) sg_dma_address(sg);
>>                     addr += sent_len;
>>             }
>>     }
>>     block_len -= sent_len;
>>     send_addr = addr;
>> ...
>>
>> This only happens if the transfer was considered suitable for IDT.
>> When I patched the function "xhci_urb_suitable_for_idt" to always return false (nothing suitable for IDT) everything was working fine.
>>
>>
>> Unfortunately I'm not deep enough into the inner workings of the kernel USB host driver to find a solution for this other than reverting the patch for IDT.
>
> What patch did you find that caused this regression?  We can revert it
> if that is the easiest thing to do.
>
> thanks,
>
> greg k-h

I included the patch causing it above: https://patchwork.kernel.org/patch/10919167/

Greetings,
Maik Stohn
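
To illustrate the interaction described above, here is a minimal user-space sketch in C. It is not kernel code and not a proposed patch: every toy_* name, the struct layouts and the TOY_IDT_MAX limit are hypothetical stand-ins. It only models the two observations from the report: the DMA fields of a scatter-gather entry are meaningful only after the mapping step has run, and the mapping step is skipped when a transfer is considered suitable for IDT. The idt_enabled flag models the workaround of making "xhci_urb_suitable_for_idt" always return false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for one scatter-gather entry; NOT the kernel's struct scatterlist. */
struct toy_sg {
    unsigned int length;      /* CPU-side length, always valid */
    unsigned int dma_length;  /* meaningful only after toy_map() mapped the entry */
    struct toy_sg *next;      /* NULL on the last entry */
};

/* Toy stand-in for a transfer request; all fields are hypothetical. */
struct toy_urb {
    struct toy_sg *sg;          /* first scatter-gather entry */
    int num_sgs;                /* entries supplied by the submitter */
    int num_mapped_sgs;         /* entries mapped for DMA; 0 if mapping was skipped */
    unsigned int transfer_len;  /* total bytes to send */
};

/* Assumed size limit, mirroring the "1 to 8 bytes" observed in the report. */
#define TOY_IDT_MAX 8u

/* Models the suitability check; idt_enabled == false is the reported workaround. */
static bool toy_suitable_for_idt(const struct toy_urb *urb, bool idt_enabled)
{
    return idt_enabled && urb->transfer_len <= TOY_IDT_MAX;
}

/* Models the mapping step: it is skipped entirely for IDT-suitable transfers. */
static void toy_map(struct toy_urb *urb, bool idt_enabled)
{
    assert(urb != NULL);
    if (toy_suitable_for_idt(urb, idt_enabled))
        return; /* DMA fields stay unset */

    urb->num_mapped_sgs = 0;
    for (struct toy_sg *sg = urb->sg; sg != NULL; sg = sg->next) {
        sg->dma_length = sg->length;
        urb->num_mapped_sgs++;
    }
}

/*
 * Models a walker that depends on the DMA fields. Returns the summed DMA
 * length, or -1 if entries were supplied but never mapped, which is the
 * situation the real walker does not expect.
 */
static int toy_walk(const struct toy_urb *urb)
{
    int total = 0;

    if (urb->num_sgs > 0 && urb->num_mapped_sgs == 0)
        return -1;

    for (const struct toy_sg *sg = urb->sg; sg != NULL; sg = sg->next)
        total += (int)sg->dma_length;
    return total;
}
```

In this model, a 1-byte request with idt_enabled set to true skips toy_map() and toy_walk() reports -1; with idt_enabled set to false the entry is mapped and toy_walk() returns 1. Whether the real fix should disable IDT for such requests or handle them differently is left open here.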