From: Andrey Konovalov <email@example.com> To: Alan Stern <firstname.lastname@example.org> Cc: syzbot <email@example.com>, Felipe Balbi <firstname.lastname@example.org>, email@example.com, Greg Kroah-Hartman <firstname.lastname@example.org>, LKML <email@example.com>, USB list <firstname.lastname@example.org>, syzkaller-bugs <email@example.com>, Dmitry Vyukov <firstname.lastname@example.org> Subject: Re: INFO: rcu detected stall in dummy_timer Date: Wed, 18 Sep 2019 13:22:49 +0200 Message-ID: <CAAeHK+yL9DX9oraq5dN8CSU6LzYN7PnRzppVbiG5b6QM5_ygEg@mail.gmail.com> (raw) In-Reply-To: <Pine.LNX.4.44L0.email@example.com> On Mon, Sep 16, 2019 at 6:32 PM Alan Stern <firstname.lastname@example.org> wrote: > > On Mon, 16 Sep 2019, Andrey Konovalov wrote: > > > On Fri, Sep 13, 2019 at 10:35 PM Alan Stern <email@example.com> wrote: > > > > > > On Fri, 13 Sep 2019, syzbot wrote: > > > > > > > syzbot has found a reproducer for the following crash on: > > > > > > > > HEAD commit: f0df5c1b usb-fuzzer: main usb gadget fuzzer driver > > > > git tree: https://github.com/google/kasan.git usb-fuzzer > > > > console output: https://syzkaller.appspot.com/x/log.txt?x=1146550d600000 > > > > kernel config: https://syzkaller.appspot.com/x/.config?x=5c6633fa4ed00be5 > > > > dashboard link: https://syzkaller.appspot.com/bug?extid=b24d736f18a1541ad550 > > > > compiler: gcc (GCC) 9.0.0 20181231 (experimental) > > > > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=11203fa5600000 > > > > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=162cd335600000 > > > > > > > > IMPORTANT: if you fix the bug, please add the following tag to the commit: > > > > Reported-by: firstname.lastname@example.org > > > > > > > > yurex 3-1:0.101: yurex_interrupt - unknown status received: -71 > > > > yurex 5-1:0.101: yurex_interrupt - unknown status received: -71 > > > > yurex 6-1:0.101: yurex_interrupt - unknown status received: -71 > > > > rcu: INFO: rcu_sched self-detected stall on CPU > > > > > > Andrey: > > > > > > This problem may be a result of overloading dummy_timer. The kernel > > > config you are using has CONFIG_HZ=100, but dummy-hcd needs > > > CONFIG_HZ=1000 (see the comment on line 1789). That is, lower values > > > of HZ will occasionally lead to trouble, and this may be an example. > > > > > > Can you change the config value for HZ and see if the bug still > > > reproduces? > > > > Hi Alan, > > > > I've tried running the reproducer with CONFIG_HZ=1000 and still got > > the same stall message. It's accompanied by countless "yurex > > 6-1:0.101: yurex_interrupt - unknown status received: -71" messages, > > so I believe this is an issue in the yurex driver. > > Maybe. Depends on exactly what the reproducer is doing, something > which is not at all easy to figure out from the scripts or programs. > > I got the impression that the reproducer connects an emulated yurex > device and then disconnects it -- but maybe that's not right at all. > Maybe the key point is that the reproducer sends a descriptor listing > an endpoint address that doesn't actually exist; that would have a > similar effect. Can you tell? (Trying to understand exactly what a > syzkaller test program does is not for the faint of heart.) > > As far as I can remember, the USB spec doesn't say what a device should > do when the host sends a packet to a non-existent endpoint. Which > means that some devices will do nothing at all, leading to the -71 > (-EPROTO) errors you see in the log. Indeed, there's only one place in > dummy_hcd.c where -EPROTO occurs -- for the case where an URB is sent > to an endpoint not supported by the gadget. > > This leads to the question: How should the yurex driver (or any USB > class driver, in fact) respond to a -EPROTO or similar error? The > thing is, this sort of error typically arises in two circumstances: > > The device was just unplugged, so of course it can't send > any packets back to the host; > > Noise on the bus caused a packet to be lost or corrupted. > > In the first case, it doesn't much matter what the driver does because > the disconnection will be noticed and acted on within a few hundred > milliseconds (although I suppose a driver could generate a lot of > kernel-log spam during that time). > > In the second case, retrying the lost/corrupted packet is the right > response. > > But retrying is _not_ the right response in cases where the device is > never going to respond because the endpoint address is invalid. This > can happen only in situations where the device provides incorrect > information (bad descriptors or something of that sort). The only > suitable approach I can think of is to limit the number of retries. > > Retry-limiting is not the sort of thing we want to add to each > individual USB class driver. Maybe it can be handled in the USB core; > I'll try to write some code for it. Under normal circumstances the > issue just doesn't arise, because normal devices aren't malicious. > > > Why does dumy_hcd require CONFIG_HZ=1000? The comment doesn't really > > explain the reason. > > Oh, that's simple enough. USB events tend to happen at millisecond > intervals. The data on the USB bus is organized into frames (and > microframes for high speed and SuperSpeed); a frame lasts one > millisecond (and a microframe lasts 1/8 ms). Many host controllers > report important events when a frame boundary occurs (that's how > dummy-hcd works). > > So for proper timing of the emulation, dummy-hcd requires timer > interrupts with millisecond resolution. I suppose the driver could be > changed to use a high-res timer instead of a normal kernel timer, but > for now that doesn't seem particularly important. So what are the practical differences between using CONFIG_HZ=100 and 1000 for dummy-hcd? Is is going to be slower or faster? Or can it get overloaded with data and cause stalls? Or something else? We're somewhat hesitant to change CONFIG_HZ as we don't know how it will affect other parts of the kernel (at some point the USB fuzzer will become a part of the main syzbot instance that doesn't only fuzz USB).
next prev parent reply index Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top 2019-08-09 13:18 syzbot 2019-09-13 19:56 ` syzbot 2019-09-13 20:35 ` Alan Stern 2019-09-16 15:29 ` Andrey Konovalov 2019-09-16 16:32 ` Alan Stern 2019-09-16 18:59 ` Greg Kroah-Hartman 2019-09-16 19:48 ` Alan Stern 2019-09-18 11:22 ` Andrey Konovalov [this message] 2019-09-18 14:16 ` Alan Stern 2019-09-16 19:53 ` Alan Stern 2019-09-16 20:13 ` syzbot 2019-09-17 16:47 ` [PATCH] USB: yurex: Don't retry on unexpected errors Alan Stern
Reply instructions: You may reply publically to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=CAAeHK+yL9DX9oraq5dN8CSU6LzYN7PnRzppVbiG5b6QM5_ygEg@mail.gmail.com \ --email@example.com \ --firstname.lastname@example.org \ --email@example.com \ --firstname.lastname@example.org \ --email@example.com \ --firstname.lastname@example.org \ --email@example.com \ --firstname.lastname@example.org \ --email@example.com \ --firstname.lastname@example.org \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
Linux-USB Archive on lore.kernel.org Archives are clonable: git clone --mirror https://lore.kernel.org/linux-usb/0 linux-usb/git/0.git # If you have public-inbox 1.1+ installed, you may # initialize and index your mirror using the following commands: public-inbox-init -V2 linux-usb linux-usb/ https://lore.kernel.org/linux-usb \ email@example.com public-inbox-index linux-usb Example config snippet for mirrors Newsgroup available over NNTP: nntp://nntp.lore.kernel.org/org.kernel.vger.linux-usb AGPL code for this site: git clone https://public-inbox.org/public-inbox.git